METHODS TO CONSTRUCT MULTIMERIC DNA AND POLYMERIC PROTEIN
SEQUENCES AS DIRECT FUSIONS OR WITH LINKERS 



CROSS REFERENCE TO RELATED APPLICATIONS
This application claims priority to U.S. provisional application number US
60/396,466, filed July 16, 2002, naming Stuart Bussell as inventor.

SEQUENCE LISTING 
A sequence listing is provided in electronic and printed form and as an appendix
to this application.

BACKGROUND 

The present invention relates generally to recombinant DNA technology and
recombinant protein expression, and more specifically, to constructs comprising repeat
DNA sequences and to methods of making constructs comprising repeat DNA sequences,
including constructs that encode polymer peptides and proteins, in which monomers are
either fused directly or with linkers.

Recombinant proteins have become an important class of therapeutics and
diagnostics since their introduction in the 1980s. The first recombinant protein
therapeutics replaced products isolated from either animal or human tissue. For example,
recombinant human growth hormone (recombinant human GH or rhGH) replaced
material isolated from the pituitaries of human cadavers (Jorgenson, Endocrine Reviews
12:189, 1991). The need arose because a rare fatal disease, Creutzfeldt-Jakob disease
(CJD), was transmitted by impurities in pituitary-derived hGH. The level of control
possible with the recombinant version enabled production of a drug certifiably free of
known communicable agents.

Another example of an early recombinant protein is recombinant human insulin
(rhI) (Chien, Drug Development and Industrial Pharmacy 22:753, 1996). In this case,
the recombinant product replaced, or supplemented, insulin isolated from the pancreases
of swine and cattle. The recombinant protein exactly matches the one found naturally
in humans, in contrast with the animal versions that differ by one to three amino acids.
More recombinant protein therapeutics followed, including interferons,
interleukins, hematopoietic factors, monoclonal antibodies, and others.

In the diagnostic field, antibodies, both natural and engineered, are used to
recognize and signal the presence of clinical markers. An advantage of engineered
antibody fragments over full-length antibodies is that they are amenable to production in
facile expression systems such as E. coli or P. pastoris (Pennell et al., Res Immunol
149:599, 1998).

Some of the in vivo characteristics of recombinant drugs are described by their
pharmacokinetic parameters. The field of pharmacokinetics concerns itself with the
absorption, distribution, metabolism, and excretion (ADME) of compounds delivered in
vivo. Basically, pharmacokinetic parameters describe the concentration of a drug
distributed throughout the body over time.

Generally, absorption of protein drugs requires delivery by injection. A body's
natural barriers tend to prevent the absorption of intact proteins if any other routes of
delivery are used. The digestive system breaks down proteins administered orally, while
the body's various epidermal surfaces prevent absorption throughout the body.

Once injected, proteins tend to distribute throughout the circulatory system where
they can react (part of metabolism) with other molecules or undergo excretion.
Mathematical models, of varying complexity, are available to explain experimental
measurements of drug concentrations as a function of time. One of the basic
pharmacokinetic parameters is a drug's half-life, t1/2, which is characteristic of the drug's
duration in the bloodstream.
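
As an illustration of the simplest such model, the following Python sketch assumes one-compartment, first-order elimination, in which concentration decays exponentially and t1/2 fixes the rate constant. The function name and the numerical values are hypothetical and are not taken from this disclosure.

```python
import math

def concentration(c0: float, t_half: float, t: float) -> float:
    """Drug concentration at time t under first-order elimination.

    Assumes a one-compartment model: C(t) = C0 * exp(-k * t),
    with rate constant k = ln(2) / t_half.
    """
    k = math.log(2) / t_half
    return c0 * math.exp(-k * t)

# Hypothetical values chosen only for illustration.
initial = 100.0   # arbitrary concentration units
half_life = 2.0   # hours
for hours in (0, 2, 4, 8):
    print(hours, round(concentration(initial, half_life, hours), 2))
```

Under this assumed model the concentration halves with every elapsed half-life, so a longer t1/2 directly lengthens the interval over which a dose remains above a given level.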

A key determinant of a protein's half-life in the blood is its size, and this is a
result of elimination of proteins from the blood by glomerular filtration in the kidneys
(Venkatachalam et al., Circulation Research 43:337, 1978). Basically, the filtration
allows proteins smaller than 60 kilodaltons (kD), and other similarly sized molecules, to
pass out of the blood, resulting in urinary excretion, while retaining larger ones. This has
a major impact on the dosing regimen for a given protein. Proteins smaller than 60 kD
tend to need daily, or more frequent, injections.
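
The size argument can be made concrete with simple arithmetic. The Python sketch below counts how many copies of a monomer must be fused for the chain to reach the approximately 60 kD filtration cutoff stated above. The monomer mass of 22 kD (roughly that of hGH) and the optional linker mass are illustrative assumptions, the function name is hypothetical, and the cutoff is treated as a sharp threshold, which is a simplification.

```python
FILTRATION_CUTOFF_KD = 60.0  # approximate glomerular cutoff cited in the text

def units_to_reach_cutoff(monomer_kd: float, linker_kd: float = 0.0,
                          cutoff_kd: float = FILTRATION_CUTOFF_KD) -> int:
    """Smallest number of fused monomers whose total mass is >= cutoff_kd."""
    if monomer_kd <= 0:
        raise ValueError("monomer mass must be positive")
    n = 1
    # An n-mer contains n monomers and n - 1 linkers.
    while n * monomer_kd + (n - 1) * linker_kd < cutoff_kd:
        n += 1
    return n

# Assumed monomer mass of about 22 kD, direct fusion (no linker).
print(units_to_reach_cutoff(22.0))
```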
One way to minimize the discomfort and inconvenience of daily injections is
to prolong the action of proteins once introduced in vivo. Two basic strategies are used.
One involves the formulation of the protein into a slow release formulation (Putney et al.,
Nature Biotechnology 16:153, 1998). An example of this technique involves formulating
proteins into a biocompatible polymer, poly(lactic-co-glycolic acid) (PLGA), that
dissolves slowly over time, releasing protein during the dissolution process.
Recombinant hGH is one protein successfully formulated this way (Johnson et al., Nature
Medicine 2:795, 1996). A disadvantage of this technique that complicates its widespread
application is the challenge of formulating and manufacturing each protein so that it is
stable during processing and use. Furthermore, injections of PLGA-formulated proteins
can be uncomfortable.

The other strategy to prolong a protein's in vivo action involves modifying the
protein so that it acts like a larger particle and is excreted more slowly through the
kidneys. While prolonging the protein's in vivo residence, the modification must avoid
adverse consequences such as immunogenicity, toxicity, unwanted changes to the
molecule's distribution, and unwanted changes to its activity.

A common technique in protein modification involves conjugating a native
protein to polyethylene glycol (PEG) or another protein (Roberts et al., Adv Drug Deliv
Rev 54:459, 2002). PEG molecules are manufactured in a wide range of molecular weights.
They can be attached to reactive chemical groups compatible with chemical conjugation
to proteins, and they are safe in vivo. Pegylated proteins have been approved for human
use. Pegylated interferon is an example (Sharieff et al., Cleve Clin J Med 69:155, 2002).
Pegylation effectively enhances the size of the resulting conjugate while avoiding
immunogenicity or activity alterations. However, PEG has its own chemical and physical
characteristics, and this can alter a conjugate's ADME. For example, PEG alters the
distribution of IL2 in such a way as to unacceptably increase its toxicity (Chen et al., The
Journal of Pharmacology and Experimental Therapeutics 293:248, 2000). Also, the
chemical conjugation is difficult to completely control, and any resulting conjugate is
likely to be a mix of chemical species.

Another promising technique involves conjugating or fusing proteins to a carrier
protein. There are many examples of chimeric molecules formed either through chemical
reaction between the parent proteins or through the fusion of their gene sequences. In the
case of fusion proteins, experience shows that the separate polypeptides constituting a
fusion protein generally fold into their three-dimensional conformations independently. In
fact, often a recombinant protein that misfolds when expressed by itself in E. coli will
fold properly when fused to a protein that regularly folds correctly. Examples include
fusions to commercially available proteins such as GST and NusA (see for example
Novagen, Madison, WI).

One technique to make therapeutic fusion proteins is to fuse native therapeutics to
human serum albumin (HSA) (U.S. Pat. No. 5,876,969). HSA is a 66 kD protein that is
abundant in the human bloodstream. It is non-immunogenic and readily available.
Potential problems include changed distribution of any resulting conjugate and the effect
of HSA as it is shuttled into cells that normally do not contain it intracellularly.

Another technique is to make therapeutic homomultimer fusion proteins. In this
case, the coding DNA sequence for a functional protein is connected to copies of itself.
A dimer of superoxide dismutase ("SOD") is disclosed in U.S. Pat. No. 5,084,390,
whereby the hinge region of an immunoglobulin joins two copies of the SOD monomer.
The resulting dimer has an extended in vivo half-life. In another example, a dimer of
erythropoietin is disclosed in U.S. Pat. No. 6,242,570.

Methods to manufacture highly polymerized sequences, for example polymers
having greater than two units, have been developed in the field of artificial protein
polymers. Lewis et al. (Protein Expression and Purification 7:400, 1996) reveal a method
utilizing compatible, but nonregenerable, overhang restriction sites that are engineered to
allow the polymerization of a monomeric spider silk repeating sequence in a geometric
fashion. In a similar manner, Elmorani et al. (Biochemical and Biophysical Research
Communications 239:240, 1997) use compatible, but nonregenerable, blunt end restriction
sites to produce a polymeric form of wheat gliadin.

The techniques disclosed in both cases are predicated on the presence of a pair of
compatible, nonregenerable, restriction sites at the end of the polymerizing protein
sequence. This requirement severely limits the number of sequences that are amenable to
polymerization. Another disadvantage of currently available methods is that once a final
polymeric sequence is generated, the researchers must employ additional steps to
engineer it with the appropriate 5' and 3' sequences for expression.

SUMMARY OF THE INVENTION



The present invention provides methods to easily and quickly generate multimers,
such as dimers and higher order multimers, of DNA sequences and their open reading
frame protein translations, resulting in constructs for the expression of proteins of greater
molecular weight and valency. Methods are described whereby a sequence is attached to
one or more versions of itself, either via a direct fusion or with a linker, where each
version shares strong homology and is generally considered the same via its sequence and
mode of action. In addition, the multimer is attached to terminal functional elements.
The monomer can theoretically have any sequence and can consist of elements from one
or more genes or synthetic DNA fragments. Thus, although the polymerization employs
homomultimers, the fundamental monomers themselves can be generated from
heterogeneous sequences. Furthermore, heteromultimers can be produced from
monomers previously manipulated with the methods of this invention if the constitutive
monomers have compatible ends.

In one aspect, the present invention comprises multimer assemblies of cassettes
that comprise nucleic acid sequences having restriction sites that can be ligated together
to form constructs (multimer cassettes) having multiple copies of a sequence of interest
(the monomer sequence), such as a sequence that encodes a peptide or protein.
Restriction sites used to ligate cassettes of a multimer assembly together to make a
multimer cassette comprise restriction pair members that, when ligated together, do not
regenerate a restriction site. In one embodiment of the present invention, multimer
assemblies are used that comprise 1) at least one amplification cassette comprising at
least a monomer sequence and 2) at least one 3'-terminal cassette comprising at least one
3' specific sequence or at least one 5'-terminal cassette comprising at least one 5' specific
sequence. Preferably, the 5'-terminal and/or 3'-terminal cassettes additionally comprise
at least a portion of the monomer sequence.

In some preferred embodiments of this aspect of the invention, component
cassettes (such as amplification cassettes, 5'-terminal and/or 3'-terminal cassettes) of a
multimer assembly can comprise one or more flanking restriction sites that can facilitate
cloning of multimer cassettes.

In some preferred embodiments, component cassettes (such as amplification
cassettes, 5'-terminal and/or 3'-terminal cassettes) can comprise one or more linker
sequences, such as linker sequences that encode amino acids or peptides that can be used
to link monomers. Such linker sequences can also comprise restriction sites, such as
restriction pair members that can be used in making multimer cassettes.

In another aspect, the present invention provides methods of making multimer
cassettes. Such methods include ligation of 3' and 5' restriction pair members of
component cassettes. In some preferred embodiments, the synthesis of multimer cassettes
can optionally make use of flanking restriction sites that can be provided in the
component cassettes. In some preferred embodiments, the synthesis of multimer cassettes
can optionally make use of restriction sites that can be provided in linker sequences
included in one or more component cassettes.

The protein polymers encoded by DNA multimers of a multimer cassette can be
expressed in any suitable gene/protein expression system. For example, prokaryotic or
eukaryotic systems are suitable, as are in vitro translation systems. The multimer
assembly system described here facilitates the multimerization process and enables the
production of multimers of any size and with a variety of N-terminal, linker, and
C-terminal elements from a limited number of starting DNA sequences. For example, a
gene can be designed for intracellular expression with an N-terminal methionine and for
extracellular expression by including a secretory signal sequence after the N-terminal
methionine.

The invention can be used to produce constructs having multimeric or polymeric 
sequences of increased size and multiplicity. 

BRIEF DESCRIPTION OF THE DRAWINGS



Fig. 1 is a diagram showing an example of a multimer assembly and its cassettes for
monomers having a terminal restriction pair. (A) shows a 5'-terminal cassette with
sequence elements coding for protein N-terminal elements. The crosshatched elements
are restriction sites, the rectangular segments are portions of the monomer sequence, the
looping arrows indicate continuation as a plasmid, straight arrows indicate linker
sequences, and ~ refers to arbitrary DNA sequences. The circle is a start codon, and the
square is a 5' specific sequence. Restriction site 1 can include the start codon and/or can
be a flanking restriction site for cloning flexibility. Restriction site 3 is the 3' restriction
pair member, and 2 and 4 are flanking restriction sites for cloning flexibility. (B) shows
an amplification cassette with sequence elements coding for a polymerizing sequence.
Restriction site 5 is the 5' restriction pair member. (C) shows a 3'-terminal cassette with
sequence elements coding for C-terminal elements. The pentagon represents a 3' specific
sequence and the hexagon a stop codon. The restriction site arrangement is preferred, but
not the only arrangement for construction of an insert cassette. (D) shows one example
of a linker sequence. As shown here, it can contain elements 5' and 3' of the restriction
pair formed by ligating restriction sites 5 and 3 together. The left and right arrows
represent linker 5' and 3' elements, respectively.

Fig. 2 is a diagram showing one example of a multimer assembly and its cassettes for a
monomer with an internal restriction pair. The crosshatched elements are restriction
sites, the rectangular segments are portions of the monomer sequence, the looping arrows
indicate continuation as a plasmid, straight arrows indicate linker sequences, and ~ refers
to arbitrary DNA sequences. The circle is a start codon, and the square is a 5' specific
sequence. The pentagon represents a 3' specific sequence and the hexagon a stop codon.
(A) shows a 5'-terminal cassette with sequence elements coding for N-terminal elements.
(B) shows an amplification cassette with sequence elements coding for the polymerizing
sequence. The double arrow represents a linker (optional). (C) shows a 3'-terminal
cassette with sequence elements coding for C-terminal elements. (D) shows an
alternative 3'-terminal cassette that requires use of sequential ligation to form a multimer
expression cassette.

Fig. 3 is a diagram showing two examples of pathways that can be used in the
polymerization of amplification cassettes. Both procedures depicted involve two
generalized cassettes, one with insert sequence b1 and the other with insert sequence b2.
For pathway A, the b2 containing cassette is opened by digesting with enzymes 1 and 5.
The b1 insert sequence is isolated after digesting the b1 containing cassette with enzymes
1 and 3. For pathway B, the b1 containing cassette is opened by digesting with enzymes
2 and 3. The b2 insert sequence is isolated after digesting the b2 containing cassette with
enzymes 2 and 5. The final ligations to generate multimer assemblies are similar for both
cases. The crosshatched elements are restriction sites, the rectangular segments are insert
sequences, the looping arrows indicate continuation as a plasmid, and ~ refers to
arbitrary DNA sequences.

Fig. 4 is a diagram showing examples of sequential ligation of cassettes to create a
functional multimer cassette of a desired size. The schematic is a generalization of the
sequential ligation procedure necessary for use with a 3'-terminal cassette given in Figure
2D. Pathway A depicts the insertion of an 'S' plasmid fragment into a 'T' containing
plasmid, while Pathway B depicts the insertion of a 'T' plasmid fragment into an 'S'
containing plasmid. In the figure, S + T = 5I + AI, AI + 3I, 5IAI + 3I, or 5I + AI3I, where
5I = the insert from a 5'-terminal cassette, AI = the insert from an amplification cassette,
3I = the insert from a 3'-terminal cassette, 5IAI = the insert resulting from the ligation of
5I and AI, AI3I = the insert resulting from the ligation of AI with 3I, and 5IAI3I = the
insert resulting from the ligation of 5I with AI3I or 5IAI with 3I. Formation of 5IAI3I
requires two sequential ligations and generation of intermediate 5IAI or AI3I cassettes for
each polymer size made. The crosshatched elements are restriction sites, the rectangular
segments are insert sequences, the looping arrows indicate continuation as a plasmid, and
~ refers to arbitrary DNA sequences.

Fig. 5 is a diagram showing possible methods for generation of an insertion cassette.
Pathways A and B are alternative pathways for insertion cassette generation based on
different arrangements of flanking restriction sites. Pathway A involves opening the
5'-terminal cassette and inserting a fragment from the 3'-terminal cassette, while Pathway B
involves opening the 3'-terminal cassette and inserting a fragment from the 5'-terminal
cassette. The crosshatched elements are restriction sites, the rectangular segments are
portions of the monomer sequence, the looping arrows indicate continuation as a plasmid,
straight arrows indicate linker sequences, and ~ refers to arbitrary DNA sequences. The
circle is a start codon, and the square is a 5' specific sequence. The pentagon represents
a 3' specific sequence and the hexagon a stop codon.

Fig. 6 is a diagram showing one possible method of generating a functional multimer
cassette of a desired size from an insertion cassette and an amplification cassette. The
insertion cassette is opened at both sites of the restriction pair with subsequent ligation of
the insert from an amplification cassette, but the insert can ligate in the wrong orientation.
Correct inserts must be identified by subsequent analysis. The crosshatched elements are
restriction sites, the rectangular segments are portions of the monomer sequence, the
looping arrows indicate continuation as a plasmid, straight arrows indicate linker
sequences, and ~ refers to arbitrary DNA sequences. The circle is a start codon, and the
square is a 5' specific sequence. The pentagon represents a 3' specific sequence and the
hexagon a stop codon.

Fig. 7 is a diagram showing another possible method of generating a functional multimer
cassette of a desired size from an insertion cassette and an amplification cassette. The
insertion cassette is opened with enzymes 3 and 2 to create an oriented ligation, but an
additional step is required. In this case, the amplification cassette has flanking restriction
site 2 on the 3' side of restriction site 3. The crosshatched elements are restriction sites,
the rectangular segments are portions of the monomer sequence, the looping arrows
indicate continuation as a plasmid, straight arrows indicate linker sequences, and ~ refers
to arbitrary DNA sequences. The circle is a start codon, and the square is a 5' specific
sequence. The pentagon represents a 3' specific sequence and the hexagon a stop codon.

Fig. 8 is a diagram showing another possible scheme for generating a functional multimer
cassette of a desired size from an insertion cassette and an amplification cassette in
similar fashion to Figure 7, but the amplification cassette has flanking restriction site 2
on the 5' side of restriction site 5.

Fig. 9 is a diagram showing the PCR amplification of the hGH gene, its subsequent
ligation to generate pOAO, and the ligation of the OmpA leader sequence to generate
p0C0A2.

Fig. 10 is a diagram showing the PCR mutagenesis of the hGH gene to generate pOA01.
The diagram also shows the ligation of the OmpA sequence into pOA01 to generate
pOA11A2 and the ligation of the PstI/BamHI fragment from pOA01 into pOA03 to
generate pOA11A1.

Fig. 11 is a diagram showing the PCR mutagenesis of the hGH gene to generate pOA11B.

Fig. 12 is a diagram showing the ligation of synthetic sequences to generate pOA11G1
and pOA11G2.

Fig. 13 is a diagram showing the polymerization of a GH direct fusion amplification
cassette.

Fig. 14 is a diagram showing the generation of the GH direct fusion insertion cassette,
pOA11D, and subsequent ligation of an amplification cassette to generate a multimer
expression cassette.

Fig. 15 is a diagram showing the PCR mutagenesis of the hGH gene to generate pOA21B,
the base amplification cassette for the GH glycine linker assembly.

Fig. 16 is a diagram showing the PCR mutagenesis of the hGH gene to generate the base
cassettes, pOA31A, pOA31B, and pOA31C, for the GH SWG4S assembly.

Fig. 17 is a diagram showing the sequential ligation of the GH SWG4S assembly
cassettes to generate the multimer expression cassette, pOA31E3.

Fig. 18 is a picture of an SDS-PAGE gel showing the separation of proteins by molecular
weight from separate lysates from cells expressing different polymers of rhGH. Lane 1
contains molecular weight standards, lane 2 the rhGH monomer, lane 3 the rhGH dimer,
lane 4 the rhGH trimer, lane 5 the rhGH pentamer, and lane 6 the rhGH nonamer.

Fig. 19 is a diagram showing insertion of synthetic sequences to generate the G4S
assembly 5'-terminal and amplification cassettes.

Fig. 20 is a diagram showing PCR mutagenesis of the hGH gene to generate pOA04 and
pOA41G.

Fig. 21 is a diagram showing ligation of the insert from pOD13A with pOA04 to generate
pOA43B and ligation of the PstI/EcoRI fragment from pOA11A1 to generate pOA43A.

Fig. 22 is a diagram showing ligations to generate the base cassettes, pOA51A, pOA51B,
and pOA51G, for the GH direct fusion assembly utilizing blunt ended HindIII and NcoI
sites for the restriction pair.

Fig. 23 is a diagram showing the polymerization of the pOA51B insert to generate
pOA51B2.

DETAILED DESCRIPTION OF THE INVENTION

Introduction 

The current invention discloses methods that extend the polymerization
techniques in three important ways. First, it introduces new methods to generate highly
polymerized sequences from monomers that are incompatible with previous protein
polymerization techniques. Second, it introduces additional linker sequences that, when
paired with the monomer sequences, facilitate their use. Third, it introduces methods that
facilitate the construction and expression of functional multimers and polymers. Taken
together, the new methods enable the generation of large numbers of polymer variants
that can differ in sequence and degree of polymerization. These variants can then be
tested for desirable traits.

The disclosed techniques are applicable to any polypeptide sequence and can
prove useful for proteins for which increased total molecular weight is deemed
advantageous. The disclosed techniques are also useful for proteins for which increased
valency is deemed advantageous. For example, single chain antibody fragments
expressed as larger fused multimers have the advantage of high valency and a
stable linkage. Furthermore, if cassettes for two different sequences share compatible
restriction pair members, they can be co-polymerized to produce heteromultimers.

Definitions

Unless defined otherwise, all technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art to which this
invention belongs. Conventional methods are used for these procedures, such as those
provided in the art and various general references. Where a term is provided in the
singular, the inventors also contemplate the plural of that term. The nomenclature used
herein and the laboratory procedures described below are those well known and
commonly employed in the art. Where there are discrepancies in terms and definitions
used in references that are incorporated by reference, the terms used in this invention
shall have the definitions given herein. As employed throughout the disclosure, the
following terms, unless otherwise indicated, shall be understood to have the following
meanings:

Monomer. A DNA or amino acid sequence whose polymerization is desirable. A
monomer can be a portion of a naturally occurring sequence (for example, a binding
domain of an antibody). The sequence can be derived from one or more naturally
occurring ones, or can be a synthetic sequence, or can be any combination of sequences
of synthetic and natural origins. Monomers of the present invention can comprise
linkers. As used herein, "monomer sequence" means a nucleic acid sequence.

Multimer. A nucleic acid sequence encoding two or more monomers.

Polymer or Multimeric protein. A functional polypeptide that can be synthesized from a
multimer assembly of the present invention. A polymer comprises at least two monomers
(where each monomer can optionally comprise one or more linkers), can comprise one or
more 5' translated regions (for example, signal peptides, N-terminal regions, "pro" or
"pre" protein sequences, tag sequences, etc.), and can comprise one or more 3' translated
regions (for example, C-terminal regions, tag sequences, etc.).

Linker. A linker is a DNA or amino acid sequence that connects one DNA sequence with
another through covalent bonds or an amino acid or peptide that connects one peptide or
protein unit with another peptide or protein unit through peptide bonds. An amino acid or
peptide linker can be a single amino acid (for example, glycine) or can be more than one
amino acid.

Restriction Pair. Two restriction sites that have different recognition sequences that are
ligation compatible, but when ligated together do not regenerate either of the two original
restriction sites. A restriction pair can include two restriction sites that have overhangs,
such as BglII and BamHI, or can include any two blunt end restriction sites that do not
have the same recognition sequence, such as StuI and NaeI. In a broader application, a
restriction pair can also include restriction sites that are initially ligation incompatible but
are blunt ended to make them ligation compatible. An example includes blunt ending
HindIII and NcoI to make them ligation compatible.

Restriction pair member or restriction member. A restriction site that is part of a
restriction pair. The 5' and 3' restriction pair members together make up a restriction
pair, and each is the other's partner.
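
The defining property of a restriction pair can be checked computationally. The following Python sketch uses the well-known recognition sequences and top-strand cut positions of BamHI (G^GATCC), BglII (A^GATCT), StuI (AGG^CCT), and NaeI (GCC^GGC). The function names are hypothetical, the check considers only the six-base junction itself (not neighboring bases), and it assumes palindromic sites that leave blunt ends or 5' overhangs.

```python
# Recognition sequence and top-strand cut offset for each enzyme.
SITES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC, 5' overhang GATC
    "BglII": ("AGATCT", 1),  # A^GATCT, 5' overhang GATC
    "StuI": ("AGGCCT", 3),   # AGG^CCT, blunt
    "NaeI": ("GCCGGC", 3),   # GCC^GGC, blunt
}

def overhang(enzyme: str) -> str:
    """Single-stranded overhang left by the enzyme ('' for blunt cutters)."""
    seq, cut = SITES[enzyme]
    return seq[cut:len(seq) - cut]

def junction(upstream: str, downstream: str) -> str:
    """Top-strand sequence formed when an end cut by `upstream` is ligated
    to an end cut by `downstream`."""
    up_seq, up_cut = SITES[upstream]
    down_seq, down_cut = SITES[downstream]
    return up_seq[:up_cut] + down_seq[down_cut:]

def is_restriction_pair(a: str, b: str) -> bool:
    """True if a and b differ, are ligation compatible, and neither
    junction regenerates either original recognition sequence."""
    originals = {SITES[a][0], SITES[b][0]}
    if len(originals) < 2 or overhang(a) != overhang(b):
        return False
    return junction(a, b) not in originals and junction(b, a) not in originals

print(junction("BamHI", "BglII"))
print(is_restriction_pair("BglII", "BamHI"))
print(is_restriction_pair("StuI", "NaeI"))
```

The blunt-ended HindIII/NcoI case mentioned above is not modeled here, because filling in the overhangs changes the junction sequence in a way this simple slicing does not capture.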



5' restriction pair member or 5' restriction member or 5' member. A restriction pair
member that is located at the 5' terminus of a DNA sequence, such as a DNA sequence
that, at least in part, encodes a monomer whose multimerization is desired or a multimer of
the present invention, or is located at the 5' terminus of a DNA sequence of interest
whose ligation to a multimer is desired. The term "5' restriction pair member" or "5'
member" can be used to refer to an unaltered restriction site (for example, a BamHI site)
or to a restriction site that has been altered, such as, for example, a filled-in 5' restriction
pair member (such as a blunt ended BamHI site), or a fused 5' restriction pair member (for
example, a ligated BamHI/BglII site).

3' restriction pair member or 3' restriction member or 3' member. A restriction pair
member that is located at the 3' terminus of a DNA sequence, such as a DNA sequence
that, at least in part, encodes a monomer whose multimerization is desired or a multimer of
the present invention, or is located at the 3' terminus of a DNA sequence of interest
whose ligation to a multimer is desired. The term "3' restriction pair member" or "3'
member" can be used to refer to an unaltered restriction site (for example, a BglII site) or
to a restriction site that has been altered, such as, for example, a filled-in 3' restriction
pair member (such as a blunt ended BglII site), or a fused 3' restriction pair member (for
example, a ligated BamHI/BglII site).

Flanking restriction site or flanking site. A restriction site that is not a member of a
restriction pair used in the constructs and methods of the present invention. Its location
outside of insert sequences and restriction pair members used in the cassettes and
methods of the present invention can facilitate manipulation of the insert.

Insertion restriction site. A specific flanking restriction site that is 3' of the 3' restriction
pair member of the 5'-terminal cassette and 5' of the 5' restriction pair member of the
3'-terminal cassette.



Amplification cassette. A DNA sequence that includes at least one monomer that is
flanked by a restriction pair. An amplification cassette has a 5' restriction pair member at
its 5' terminus and a 3' restriction pair member at its 3' terminus. The restriction pair
enables the multimerization of the sequence or the ligation of it to other sequences with
ligation compatible restriction sites. An amplification cassette can optionally comprise
other sequences as well, such as but not limited to sequences that code for amino acid or
peptide linkers.
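
Because the fused junction of a restriction pair cannot be recut, a ligated product still carries exactly one intact 5' member and one intact 3' member and can itself serve as an amplification cassette. The Python sketch below models that bookkeeping only; the class and function names are hypothetical, and no actual digestion, ligation, or cloning chemistry is simulated.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class AmplificationCassette:
    # Ordered monomer labels lying between the intact 5' and 3'
    # restriction pair members.
    monomers: Tuple[str, ...]

def ligate(upstream: AmplificationCassette,
           downstream: AmplificationCassette) -> AmplificationCassette:
    """Fuse the 3' member of `upstream` to the 5' member of `downstream`.

    The fused site is no longer cleavable, so the product is again
    flanked by a single usable restriction pair.
    """
    return AmplificationCassette(upstream.monomers + downstream.monomers)

def self_ligate(base: AmplificationCassette, rounds: int) -> AmplificationCassette:
    """Ligate a cassette to a copy of itself `rounds` times (doubling)."""
    cassette = base
    for _ in range(rounds):
        cassette = ligate(cassette, cassette)
    return cassette

monomer = AmplificationCassette(("GH",))
octamer = self_ligate(monomer, 3)
pentamer = ligate(self_ligate(monomer, 2), monomer)
print(len(octamer.monomers), len(pentamer.monomers))
```

In this model, self-ligation doubles the monomer count each round, and sizes that are not powers of two follow from ligating cassettes of different lengths, for example a tetramer with a monomer.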

5'-terminal cassette. A DNA sequence that comprises a 3' restriction pair member, at
least one 5'-specific sequence, where a 5'-specific sequence is a sequence that, when
positioned at the 5' end of a multimer sequence, can facilitate the use of DNA multimers
or the expression, purification, or identification of at least one protein polymer of the
present invention, and, preferably, at least a portion of a monomer sequence. The 3'
restriction pair member is ligation compatible with the 5' terminus of at least one
amplification cassette. The 5'-terminal cassette is useful for introducing 5'-terminal
DNA sequences that contribute to making a sequence functional. Examples of 5' specific
sequences include, but are not limited to, the translation start codon, secretion sequences,
tag sequences, linker sequences, or special restriction sites.

3'-terminal cassette. A DNA sequence that comprises a 5' restriction pair member, at
least one 3'-specific sequence, where a 3'-specific sequence is a sequence that, when
positioned at the 3' end of a multimer sequence, can facilitate the use of DNA multimers
or the expression, purification, or identification of at least one protein polymer of the
present invention, and, preferably, at least a portion of a monomer sequence. The 5'
restriction pair member is ligation compatible with the 3' terminus of at least one
amplification cassette. The 3'-terminal cassette is useful for introducing 3'-terminal
DNA sequences that contribute to making a sequence functional. Examples of 3'-specific
sequences include, but are not limited to, tag sequences, C-terminal sequences,
polyadenylation sequences, stop codons, linker sequences, and the like.



Insert sequence. The functional sequence in a cassette. For the amplification cassette,
the functional sequence includes both restriction pair members and all sequence in
between, including the monomer sequence. For the 5'-terminal cassette, the functional
sequence includes the 3' restriction pair member, all 5'-specific sequences, and its
portion of a monomer sequence, if present. For the 3'-terminal cassette, the functional
sequence includes the 5' restriction pair member, all 3'-specific sequences, and its
portion of a monomer sequence, if present. For multimer cassettes, the functional
sequence includes the functional sequences of the constituent cassettes.

Multimer assembly. The collection of all cassettes that, in combination, after ligation,
yields functional multimer DNA sequences or polymer protein sequences of a starting
monomer. A multimer assembly comprises one or more 5'-terminal cassettes and one or
more amplification cassettes; one or more amplification cassettes and one or more
3'-terminal cassettes; or one or more 5'-terminal cassettes, one or more amplification
cassettes, and one or more 3'-terminal cassettes that can be fused using 3' and 5'
restriction pair members.

Multimer cassette. A cassette resulting from the ligation of two or more cassettes from
the same multimer assembly.

Insertion cassette. A multimer cassette generated from the ligation of a 5'-terminal and
3'-terminal cassette of a multimer assembly that is ligation compatible with any of said
assembly's amplification cassettes to generate a multimer cassette.

Multimer expression cassette. A multimer cassette that, when transcribed and translated
in a suitable expression system, produces a polymer protein sequence of a starting
monomer.

Segment of a monomer sequence. A segment of a monomer sequence is a portion of a
monomer sequence, that is, a nucleic acid sequence that encodes a portion of a monomer.

I. METHODS OF MAKING MULTIMER ASSEMBLIES

The present invention includes methods of fusing two or more nucleic acid
sequences. The nucleic acid sequences can encode peptide or protein sequences, such
that when the nucleic acid sequences are expressed, a polymeric protein is produced.
Preferably, in the methods of the present invention, the peptide or protein monomers
encoded by the nucleic acid sequences are identical peptide or protein monomers.
However, this is not a requirement of the present invention. The nucleic acid sequence
whose polymerization is desired is called a monomer sequence.

Monomer sequences can encode proteins or peptides whose function is known or
unknown. Preferably, however, the identity and function of the peptide or protein
encoded by a monomer sequence are known. Of particular interest are peptides and
proteins that can have diagnostic or therapeutic value (for example, human growth
hormone, hGH), although the invention is not limited to these protein sequences.

For example, monomer sequences can encode at least a portion of one or more
receptors, receptor ligands, enzymes, inhibitors, transcription factors, translation factors,
DNA replication factors, activators, chaperonins, or antibodies. Monomer sequences can
also encode at least a portion of one or more cytokines, growth factors, or hormones such
as, but not limited to, Interferon-alpha, Interferon-beta, Interferon-gamma, Interleukin-1,
Interleukin-2, Interleukin-3, Interleukin-4, Interleukin-5, Interleukin-6, Interleukin-7,
Interleukin-8, Interleukin-9, Interleukin-10, Interleukin-11, Interleukin-12,
Interleukin-13, Interleukin-14, Interleukin-15, Interleukin-16, Erythropoietin,
Colony-Stimulating Factor-1, Granulocyte Colony-Stimulating Factor,
Granulocyte-Macrophage Colony-Stimulating Factor, Leukemia Inhibitory Factor,
Tumor Necrosis Factor, Lymphotoxin, Platelet-Derived Growth Factor, Fibroblast
Growth Factors, Vascular Endothelial Cell Growth Factor, Epidermal Growth Factor,
Transforming Growth Factor-beta, Transforming Growth Factor-alpha, Thrombopoietin,
Stem Cell Factor, Oncostatin M, Amphiregulin, Mullerian-Inhibiting Substance, B-Cell
Growth Factor, Macrophage Migration Inhibiting Factor, Endostatin, and Angiostatin.
Descriptions of these proteins can be found in Human Cytokines: Handbook for Basic
and Clinical Research, Aggarwal, B. B. and Gutterman, J. U. Eds., Blackwell Scientific
Publications, Boston, Mass., (1992), which is herein incorporated by reference in its
entirety.

The monomer encoding sequences are polymerized together by ligation of
compatible, nonregenerable restriction sites, called restriction pair members. Unlike
previous methodologies, the present invention employs cassettes with sequences other
than those encoding the original monomer itself in the construction process. For
example:

In the methods of the present invention, multimer assemblies are used that
comprise at least one amplification cassette and at least one of the following: at least one
3'-terminal cassette or at least one 5'-terminal cassette. An amplification cassette
comprises an insert sequence that includes a monomer sequence whose polymerization is
desired, a 5' restriction pair member at its 5' terminus, and a 3' restriction pair member at
its 3' terminus. A 3'-terminal cassette comprises an insert sequence that includes at least
one 3' specific sequence and a 5' restriction pair member site that can be fused to a 3'
restriction pair member site of at least one of the one or more amplification cassettes. A
5'-terminal cassette comprises an insert sequence that includes at least one 5' specific
sequence and a 3' restriction pair member site that can be fused to a 5' restriction pair
member site of at least one of the one or more amplification cassettes. Preferably, the
5'-terminal and/or 3'-terminal cassettes additionally comprise at least a portion of the
monomer sequence.

5' specific sequences can be, but are not limited to, sequences that enhance
transcription, translation, secretion, protein folding, protein solubility, or binding of the
protein to specific binding members such as antibodies. 3' specific sequences can be, but
are not limited to, stop codons or sequences that enhance RNA stability, protein folding,
protein solubility, or binding of the protein to specific binding members such as
antibodies.

In the multimer assemblies of the present invention, 5' and 3' restriction pair
members are used to fuse amplification cassettes, and preferably, where applicable,
3'-terminal cassettes to amplification cassettes and 5'-terminal cassettes to amplification
cassettes. 5' and 3' restriction pair members are preferably unique restriction sites that
are ligation compatible, and said ligation destroys each member. In the alternative, 5'
and 3' restriction pair members can be ligation incompatible sites that are made ligation
compatible by blunt ending.
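
As an illustration only, the following Python sketch models how ligating two compatible restriction pair members destroys both sites. It assumes the BamHI/BglII pair named in the definitions (recognition sequences G^GATCC and A^GATCT, each leaving a GATC overhang); the function names are hypothetical and are not part of the disclosed methods.

```python
# Recognition sequences and top-strand cut offsets (cut after this many bases).
SITES = {"BamHI": ("GGATCC", 1), "BglII": ("AGATCT", 1)}

def left_half(enzyme):
    """Top-strand bases retained upstream of the cut (for example 'A' for BglII)."""
    seq, cut = SITES[enzyme]
    return seq[:cut]

def right_half(enzyme):
    """Top-strand bases retained downstream of the cut, overhang included."""
    seq, cut = SITES[enzyme]
    return seq[cut:]

def overhang(enzyme):
    """The 5' overhang generated by the enzyme."""
    seq, cut = SITES[enzyme]
    return seq[cut:len(seq) - cut]

def fuse(three_prime_member, five_prime_member):
    """Junction formed when an upstream 3' member is ligated to a downstream 5' member."""
    if overhang(three_prime_member) != overhang(five_prime_member):
        raise ValueError("overhangs are not ligation compatible")
    return left_half(three_prime_member) + right_half(five_prime_member)

def is_cuttable(junction):
    """True if either recognition sequence survives in the junction."""
    return any(seq in junction for seq, _ in SITES.values())

junction = fuse("BglII", "BamHI")
print(junction, is_cuttable(junction))  # AGATCC False
```

The fused junction AGATCC is recognized by neither enzyme, which is what allows the outer, unfused members of a multimer cassette to remain unique.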

One aspect of the present invention is construction of cassettes comprising one or
more flanking restriction sites that aid their use, but this is not a requirement of the
present invention. Preferably, 3'-terminal cassettes and 5'-terminal cassettes, if present,
comprise 3' and 5' flanking restriction sites. Flanking restriction sites can be any
restriction site (except restriction pair member sites used in the same construct), and
preferably aid the use of cassettes by increasing the facility of making multimer cassettes.
For example, the flanking sites facilitate the manipulation of the insert sequences,
including their isolation and ligation. For example, some preferred methods employ an
insertion restriction site, which is a specific flanking restriction site that is 3' of the 3'
restriction pair member of the 5'-terminal cassette and 5' of the 5' restriction pair
member of the 3'-terminal cassette. Flanking restriction sites can also optionally be used
to transfer constructs and assemblies to different expression vectors.

In some preferred methods of the invention, sequences encoding linkers are
employed. Multimer assembly cassettes can comprise one or more linker sequences.
Multimer assembly cassettes can have linker sequences 5' of one or more insert
sequences, 3' of one or more insert sequences, or both 5' and 3' of one or more insert
sequences. Linker sequences can be part of amplification cassettes, 5'-terminal cassettes,
3'-terminal cassettes, or any combination thereof. In preferred aspects of the present
invention, nucleic acid sequences that encode amino acid or peptide linkers that are used
to link monomers can also comprise restriction sites, such as 3' or 5' restriction pair
member sites that can facilitate construction of multimer assemblies. This provides a
convenient means for introducing restriction pair members for efficient polymerization of
monomer sequences through amplification cassettes and optionally 5'-terminal cassette
or 3'-terminal cassette ligations. Alternatively, or in addition, amino acid or peptide
linkers can be used to provide optimal spacing or folding of translated monomers or a
polymer.

Where more than one linker sequence is used in a single multimer assembly cassette,
they may or may not occur between each and every monomer sequence. Where more
than one linker sequence is used in a single multimer assembly cassette, they can encode
the same or different amino acid or peptide linkers.

Peptide linkers are well known in the art. Preferably linkers are between one and
twenty amino acids in length, and more preferably between one and ten amino acids in
length, although length is not a limitation in the linkers of the present invention.
Preferably linkers comprise amino acid sequences that do not interfere with the
conformation and activity of peptides or proteins encoded by monomers of the present
invention. Some preferred linkers of the present invention are those that include the
amino acid glycine. Examples include those disclosed in Table 1.

In an expressed protein polymer, such amino acid or peptide sequences join
peptide or protein monomer sequences. If a linker is part of the insert sequence of the
amplification cassette, it becomes part of the monomer that is to be multimerized. The
linker sequence can comprise at least one restriction pair member.

The present invention also introduces several methods to expand the use of 
restriction pair member sites. For example: 

In some methods of the present invention, restriction pair members that are used to
join monomer sequences are internal to a monomer sequence. In these embodiments, an
amplification cassette comprises a 5' segment of a monomer sequence and a 3' segment
of a monomer sequence that together comprise the sequence of a complete monomer.
The 5' segment is positioned 3' of the 3' segment, the 5' terminus of the 3' segment is a
5' restriction pair member, and the 3' terminus of the 5' segment is a 3' restriction pair
member. In this case, in making a multimer cassette, ligation of the 3' restriction pair
member of the 5' segment of one amplification cassette with the 5' restriction pair
member of the 3' segment of another amplification cassette can form a complete
monomer sequence. In order to complete the polymer sequences, a multimer assembly
preferably comprises a 5'-terminal cassette that comprises the 5' monomer segment and
a 3'-terminal cassette that comprises the 3' monomer segment. In this way, monomer
sequences provided in the amplification cassettes can be provided in non-contiguous
segments. In some preferred methods of the present invention, the amplification cassette
further comprises a linker that is positioned between the 5' segment and the 3' segment
of the monomer sequence.

In some methods of the present invention, restriction pair members can be
overhang restriction sites. In some methods of the present invention, restriction pair
members can be blunt end restriction sites. In some other methods of the present
invention, restriction pair members are incompatible "overhang" restriction sites that are
converted to blunt end restriction sites through the use of polymerases or nucleases.
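
The blunt-ending option can be sketched as follows. This is a hedged illustration, not the disclosed procedure: EcoRI (G^AATTC) and BamHI (G^GATCC) are chosen here only as an example of two sites with incompatible 5' overhangs, and the function names are hypothetical. A polymerase fill-in keeps the overhang bases, whereas a single-strand nuclease removes them.

```python
# Recognition sequence and top-strand cut offset for two enzymes whose
# 5' overhangs (AATT and GATC) cannot anneal to each other.
SITES = {"EcoRI": ("GAATTC", 1), "BamHI": ("GGATCC", 1)}

def blunt_ends(enzyme, method):
    """Return (upstream_end, downstream_end) top-strand bases after blunting."""
    site, cut = SITES[enzyme]
    left = site[:cut]
    overhang = site[cut:len(site) - cut]
    right = site[len(site) - cut:]
    if method == "fill-in":     # polymerase copies the overhang on both ends
        return left + overhang, overhang + right
    if method == "nuclease":    # single-strand nuclease trims the overhang
        return left, right
    raise ValueError("unknown blunting method: " + method)

def blunt_junction(three_prime_member, five_prime_member, method):
    """Sequence at the junction after blunt-end ligation of the two members."""
    upstream, _ = blunt_ends(three_prime_member, method)
    _, downstream = blunt_ends(five_prime_member, method)
    return upstream + downstream

print(blunt_junction("EcoRI", "BamHI", "fill-in"))   # GAATTGATCC
print(blunt_junction("EcoRI", "BamHI", "nuclease"))  # GC
```

The two methods leave junctions of different lengths, so in a coding sequence the choice of blunting method would need to be made with the reading frame in mind.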

In some preferred methods of the present invention, restriction pair members are
conveniently provided in one or more linker sequences. In these embodiments, linker
sequences comprising a restriction pair member can be engineered onto the 3', 5', or both
ends of an insert sequence.

In some preferred methods of the present invention, the 3'-restriction pair member
codes for a stop codon that is destroyed upon ligation to the 5'-restriction pair member.

In one aspect of the present invention, the assembly methodology consists of the
following four steps:

1. Generate or obtain the DNA for the monomer.

Techniques familiar to those skilled in the art include, but are not limited to:

a. Amplification of a sequence from a DNA library, optionally including any
additions or mutations to the sequence in PCR primers.

b. Chemical synthesis of the sequence.

c. Splicing of sequences together from pre-existing DNA.

2. Decide what linker sequence, if any, to use between monomers and construct a 
multimer assembly. 

Options for the linker include none (direct fusion of monomers), a linker
encompassing a restriction pair member within its sequence, a linker with restriction
pair members at one or more termini, or a linker lacking a restriction pair member.
Once a linker is added, it becomes part of the monomer sequence.

For each option, three basic cassettes can be generated: one or more 5'-terminal
cassettes, at least one amplification cassette, and one or more 3'-terminal cassettes.
However, in some instances, all three cassettes are not required. A multimer assembly
comprises at least one amplification cassette, and one or more 5'-terminal cassettes or
one or more 3'-terminal cassettes, or can have at least one amplification cassette, one or
more 5'-terminal cassettes, and one or more 3'-terminal cassettes. In some cases,
multiple versions of each cassette may be desirable. Furthermore, the amplification
cassette can be polymerized to produce new higher order (multimeric) amplification
cassettes.

The ends of the monomers determine the characteristics of the cassettes. The current
invention discloses the use of linkers to introduce ends containing a restriction pair as
well as the construction of 5'-terminal and/or 3'-terminal cassettes to facilitate their use.

As an alternative to engineering the ends of a monomer with a restriction pair,
the cassettes can be constructed with a restriction pair internal to the monomer sequence.
The construction of the cassettes is modified to accommodate the presence of a
noncontiguous monomer in each.

Finally, a method is disclosed in which the constructions for a restriction pair, either at
the ends or internal to the monomer, are extended to use with a pair of incompatible
restriction sites. This method is less preferred, as the method requires that blunt ends for
ligation are created for each ligation step (by nuclease digestion or polymerase fill-in, or
both), decreasing the efficiency of the procedure.

The following are the general steps for construction of the assemblies for each
possible restriction pair case:

a. Using a monomer sequence with a terminal restriction pair. 

The scheme shown in Figure 1 is applicable for any monomer sequence that can be
engineered with a terminal restriction pair. The steps to engineer the assembly can
include the following:

(1) Engineer 5'-terminal cassettes containing one or more 5' specific DNA sequences
(for example, start codon, secretion sequence, etc.), preferably the monomer sequence,
linker sequence, if present, and the 3' member of the restriction pair.

(2) Engineer an amplification cassette containing a 5' restriction member, optionally a
first linker sequence, at least one monomer sequence, optionally a second linker
sequence, and a 3' restriction member.

(3) Engineer 3'-terminal cassettes containing a 5' restriction member, optionally a linker
sequence, preferably the monomer sequence, and one or more 3'-terminal specific DNA
sequences (specific recognition sequences, stop codon, etc.).

An alternative formulation involves 5'-terminal and/or 3'-terminal cassettes that
do not include any monomer sequence. The utility of including the monomer sequence in
both terminal cassettes lies in utilizing the restriction pair members to join each terminal
cassette to an amplification cassette; however, this is not a requirement of the present
invention.

b. Using a monomer sequence with an internal restriction pair.

The scheme shown in Figure 2 is applicable for any monomer sequence that can be
engineered with an internal restriction pair. The steps to engineer the assembly include
the following:

(1) Engineer 5'-terminal cassettes containing one or more 5' specific DNA sequences
(start codon, secretion sequence, etc.), the portion of a monomer sequence that occurs on
the 5' side of the restriction pair (the 5' monomer segment), and finally the 3' restriction
pair member.

(2) Engineer an amplification cassette containing a 5' restriction pair member, DNA
encoding the portion of a monomer sequence that occurs 3' of the restriction pair (the 3'
monomer segment), optionally a linker sequence, DNA encoding the portion of a
monomer that occurs 5' of the restriction pair (the 5' monomer segment), and a 3'
restriction pair member.

(3) Engineer 3'-terminal cassettes containing the 5' restriction pair member, the portion
of a monomer sequence that occurs 3' of the restriction pair (the 3' monomer segment),
and one or more 3'-terminal specific DNA sequences (specific recognition sequences,
stop codon, etc.).

c. Using a monomer sequence with a pair of incompatible restriction sites made
compatible by blunt ending.

Either scheme shown in Figure 1 or Figure 2 is applicable, but in this case the
restriction pair consists of restriction sites that are blunt ended to make them compatible.

Once constructed, the amplification cassette enables generation of a sequence 
containing any number of monomers fused together. 

3. Polymerize the amplification cassette in an arithmetic, geometric, or mixed 
progression (see Figure 3). 

A series of amplification cassettes are generated from the original amplification
cassette. The technique involves digesting a first construct comprising an amplification
cassette at two 5' or two 3' sites of an insert, one of which is a restriction pair member
site and the other of which is an external flanking site (external to the restriction pair
member site), to open up the construct. This is followed by digesting a second construct
comprising an amplification cassette at the same flanking site, but with the opposite
restriction pair member, to release the amplification sequence from the plasmid as a
fragment. This sequence is then ligated into the opened first plasmid construct from
before. Both restriction sites used in the ligation are destroyed, but the resulting cassette
has intact flanking restriction sites and an intact restriction pair on the ends that enable
further polymerizations.
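
The bookkeeping of this step can be sketched symbolically. The following Python fragment is a simplified model under stated assumptions: flanking sites and the plasmid backbone are omitted, a cassette insert is represented only as a tuple of named elements, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cassette:
    """Insert of an amplification cassette: 5' member, body elements, 3' member."""
    body: tuple  # for example ("monomer",) or ("monomer", "scar", "monomer")

    def elements(self):
        # The outer restriction pair members are always present and intact.
        return ("5' member",) + self.body + ("3' member",)

def ligate(upstream, downstream):
    """Fuse the 3' member of `upstream` to the 5' member of `downstream`.

    The two fused members are consumed and leave a non-cleavable 'scar';
    the outermost 5' and 3' members are untouched.
    """
    return Cassette(upstream.body + ("scar",) + downstream.body)

monomer = Cassette(("monomer",))
dimer = ligate(monomer, monomer)
tetramer = ligate(dimer, dimer)
print(dimer.elements())
print(tetramer.body.count("monomer"))
```

In this model every product still carries exactly one 5' member and one 3' member at its ends, so it can serve as either partner in a further round.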

Mixing and matching the cassettes used to open a construct that comprises an
amplification cassette and to generate an insert from a construct that comprises an
amplification cassette enables new cassettes of any size to be made in an arithmetic,
geometric, or mixed progression. For example, if the monomer is used to both open the
plasmid and create the insert, a dimer cassette is made. If the resulting dimer is used for
both, then a tetramer is made. If this tetramer is used for both, then an octamer is made,
and continuation leads to a binomial geometric progression. On the other hand, if the
monomer is always used as the insert and the newest cassette is used to receive the insert,
an arithmetic progression of one is produced. For instance, when a dimer construct is
opened and a monomer fragment inserted, then a trimer is produced. When a trimer
construct is opened and a monomer fragment is inserted, then a tetramer is produced. In
general, any new cassette can be mixed with any previously generated cassette to allow
rapid generation of a polymer of any desired size. For example, if a polymer of size 20 is
desired, the 16mer is generated geometrically, and ligating the 16mer to the tetramer
generates the 20mer in a total of only 5 ligations.
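
The mixed progression can be made concrete with a short planning sketch. The strategy coded here (double up to the largest power of two, then add previously made cassettes according to the remaining binary digits) is an assumption of this example, chosen because it reproduces the 20mer case above; the function name is hypothetical.

```python
def ligation_plan(n):
    """Return a list of (a, b, a + b) ligations that builds an n-mer.

    Each tuple means: ligate an existing a-mer cassette to an existing
    b-mer cassette to obtain an (a + b)-mer cassette.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = []
    size = 1
    # Geometric phase: monomer -> dimer -> tetramer -> ...
    while size * 2 <= n:
        steps.append((size, size, size * 2))
        size *= 2
    # Mixed phase: add smaller cassettes already made during doubling.
    total = size
    piece = size // 2
    while total < n:
        if total + piece <= n:
            steps.append((total, piece, total + piece))
            total += piece
        piece //= 2
    return steps

for a, b, product in ligation_plan(20):
    print(f"{a}mer + {b}mer -> {product}mer")
```

For a 20mer this plan uses five ligations (four doublings to reach the 16mer, then 16mer plus tetramer), whereas a purely arithmetic progression that adds one monomer at a time would need nineteen.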

Subsequent ligation to 5'- and 3'-terminal cassettes can enable production of a
functional multimer. The multimer's size, based on actual molecular weight, is
approximately a whole number multiple of the original. In addition, the composition of
the multimer is almost identical to the monomer, differing only because of any linker
sequences or terminal flanking regions that are used.

It is important to note that the polymerization does not require flanking sites.
Without flanking sites, the ligations can occur with the fragments joined in either
orientation, and more laborious subsequent analysis is needed to identify the correct
constructs. In contrast, use of flanking sites facilitates the process by enabling oriented
ligations.

4. Ligate the cassettes together to give a full length, functional, multimer. 

The cassettes can be ligated sequentially as shown in Figure 4, or an insertion
cassette can be created from the 5'- and 3'-terminal cassettes as diagrammed in Figure 5
with subsequent insertion of the polymerized amplification cassette as shown in Figures
6, 7, and 8. The use of an insertion cassette expedites the creation of a series of
multimers with the same 5' and 3' terminal elements. Figure 6 illustrates a technique for
the ligation of the fragment from an amplification cassette into an insertion cassette using
only the restriction pair restriction sites. However, the ligation is not oriented,
necessitating additional analysis to identify correct constructs. Figures 7 and 8 show
equivalent oriented ligations that result from different arrangements of flanking
sequences.

Figure 4 illustrates a method of making a multimer cassette from two cassettes
of a multimer assembly utilizing flanking sites, the two cassettes being a first cassette
comprising either a 5'-restriction pair member or a 3'-restriction pair member and a
second cassette comprising both a 5'-restriction pair member and a 3'-restriction pair
member, the method comprising:

1) providing the first cassette with a first flanking restriction site at one
end, either 5' or 3', of its insert sequence;

2) providing the second cassette with a second flanking restriction site that
is, or is made, ligation compatible with the first flanking site and is on the
same side, either 5' or 3', of its insert sequence as the first flanking
restriction site is relative to the first cassette's insert sequence;

3) digesting the first cassette at its restriction pair member and the first
flanking site and isolating the first fragment containing the insert
sequence;

4) digesting the second cassette at its restriction pair member partner to the
first cassette's restriction pair member and at the second flanking site and
isolating the second fragment containing the insert sequence;

5) ligating the first fragment with the second fragment to generate a
multimer cassette.

The identities of the first and second cassettes can vary. For example, the first
cassette can be a 3'-terminal cassette and the second cassette an amplification cassette,
the first cassette can be a 5'-terminal cassette and the second cassette an amplification
cassette, the first cassette can be a 3'-terminal cassette and the second cassette a multimer
cassette constructed from a 5'-terminal cassette and an amplification cassette, or the first
cassette can be a 5'-terminal cassette and the second cassette a multimer cassette
constructed from a 3'-terminal cassette and an amplification cassette.

For the case when the first cassette is a 3'-terminal cassette and the second
cassette is an amplification cassette, if the amplification cassette is digested at its 3'
restriction pair member and a flanking restriction site on the 5' side of its 5' restriction
member to generate a ligatable fragment, then the 3'-terminal cassette is digested at its 5'
restriction pair member and a flanking restriction site on the 5' side of this member to
generate a ligatable cassette. Alternatively, if the amplification cassette is digested at its
3' restriction pair member and a flanking restriction site on the 3' side of this member to
generate a ligatable cassette, then the 3'-terminal cassette is digested at its 5' restriction
pair member and a flanking restriction site on the 3' side of its complete insert to generate
a ligatable fragment.

It is important to note that the ligation of cassettes together does not require
flanking sites. However, flanking sites enable oriented ligations. For example, if
flanking sites are absent, a method of making a multimer cassette from two cassettes
from a multimer assembly comprising a first cassette comprising either a 5'-restriction
pair member or a 3'-restriction pair member and a second cassette comprising both a
5'-restriction pair member and a 3'-restriction pair member comprises:

1) digesting the first cassette at its restriction pair member and isolating
the first fragment containing the insert sequence;

2) digesting the second cassette at both its restriction pair member sites
and isolating the second fragment containing the insert sequence;

3) ligating the first fragment with the second fragment and screening for
correct ligation orientation to generate a multimer cassette.

Again, the identities of the first and second cassettes can vary. The first cassette
can be a 3'-terminal cassette and the second cassette an amplification cassette, the first
cassette can be a 5'-terminal cassette and the second cassette an amplification cassette,
the first cassette can be a 3'-terminal cassette and the second cassette a multimer cassette
constructed from a 5'-terminal cassette and an amplification cassette, or the first cassette
can be a 5'-terminal cassette and the second cassette a multimer cassette constructed from
a 3'-terminal cassette and an amplification cassette.

Figure 5 illustrates a method of making an insertion cassette from the 5'-terminal
cassette and the 3'-terminal cassette when each shares an insertion restriction site. The
method comprises:

1) providing the 5'-terminal cassette with a first flanking restriction site,
independent of the insertion restriction site, that is outside of the sequence
including the insert sequence and insertion restriction site of the
5'-terminal cassette;

2) providing the 3'-terminal cassette with a second flanking restriction
site, independent of the insertion restriction site, that is outside of the
sequence including the insert sequence and insertion restriction site of the
3'-terminal cassette and is, or is made, ligation compatible with the first
flanking site and is on the same side, either 5' or 3', of its insert sequence
as the first flanking restriction site is relative to the 5'-terminal cassette's
insert sequence;

3) digesting the 5'-terminal cassette at its insertion restriction site and the
first flanking site and isolating the first fragment containing the insert
sequence;

4) digesting the 3'-terminal cassette at its insertion restriction site and the
second flanking site and isolating the second fragment containing the
insert sequence;

5) ligating the first fragment with the second fragment to generate an
insertion cassette.

Figure 6 illustrates a method of making a multimer cassette comprising an insertion 
cassette and an amplification cassette from a multimer assembly comprising: 

1) digesting the insertion cassette at both its restriction pair member sites
and isolating the first fragment containing the insert sequence;

2) digesting the amplification cassette at both its restriction pair member
sites and isolating the second fragment containing the insert sequence;

3) ligating the first fragment with the second fragment and screening for
correct ligation orientation to generate a multimer cassette.
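
One possible way to think about the orientation screen is sketched below. It is an illustration under stated assumptions, not a step recited in the text: the restriction pair is taken to be BamHI (5' member) and BglII (3' member), all flanking and monomer sequences are made-up placeholders, and the function names are hypothetical. With this pair, the desired orientation leaves hybrid junctions that neither enzyme cuts, while the reverse orientation regenerates both sites.

```python
def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

BAMHI, BGLII = "GGATCC", "AGATCT"   # 5' and 3' restriction pair members
OVERHANG = "GATC"                   # 5' overhang shared by both enzymes

# Top strands after digestion, written so that plain concatenation gives the
# ligated top strand (the overhang is carried by the right-hand piece).
left_arm = "ATGAAA" + "A"               # placeholder 5'-terminal part + BglII left half
right_arm = "GATCC" + "TTTTAA"          # BamHI right half + placeholder 3'-terminal part
insert = "GATCC" + "CATCATTGG" + "A"    # BamHI right half, placeholder monomer, BglII left half

def flip(fragment):
    """Top strand of the same fragment ligated in the opposite orientation."""
    assert fragment.startswith(OVERHANG)
    return OVERHANG + revcomp(fragment[len(OVERHANG):])

def regenerated_sites(construct):
    """Names of restriction pair members that can still be cut in the construct."""
    pairs = (("BamHI", BAMHI), ("BglII", BGLII))
    return [name for name, site in pairs if site in construct]

forward = left_arm + insert + right_arm
reverse = left_arm + flip(insert) + right_arm
print(regenerated_sites(forward))   # []
print(regenerated_sites(reverse))   # ['BamHI', 'BglII']
```

Under these assumptions, a diagnostic digest with either enzyme would distinguish the two orientations; other restriction pairs may behave differently.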

Figures 7 and 8 illustrate a method of making a multimer cassette comprising an 
insertion cassette and an amplification cassette comprising: 

1) digesting the amplification cassette at the insertion restriction site and
its restriction pair member on the opposite side, either 5' or 3', of the
insert sequence and isolating the first fragment containing the insert
sequence;

2) digesting the insertion cassette at the insertion restriction site and the
restriction pair member partner to the digested amplification cassette's
restriction pair member and isolating the second fragment containing the
insert sequence;

3) ligating the first fragment with the second fragment to generate a
multimer cassette precursor;

4) digesting the multimer cassette precursor at both restriction pair
members, isolating the fragment containing the insert sequence, and
ligating it with itself to generate a multimer cassette.

Once constructed, the gene for the multimer can be used as an insert to construct
other cassettes or to express it in a suitable transcription and translation system. Once
isolated in the correct conformation and with the necessary degree of purity, polymeric
polypeptides are available for applications in the fields of medicine, veterinary care,
research and development, diagnostics, etc. The present invention comprises proteins
made from multimer assemblies of the present invention.

Each cassette can involve a fusion of any of a number of functional elements. For
example, any construction involving a linker is by nature a heteromultimer, because the
monomer contains at least two functional elements. A particularly expeditious method to
produce these fusions is to treat each functional element as a nested assembly. In other
words, each element itself is an assembly that consists of individual cassettes.

The current methods are easily extended to heteromultimers if two sequences
share compatible restriction sites. For instance, two distinct monomer amplification
cassettes, A and B, can be ligated together if they share the same restriction pair.
Subsequent polymerization of this new "monomer" results in an alternating sequence,
ABAB... Any pattern of alternating sequences can theoretically be constructed from any
number of initial monomers. For example, the pattern ABBCABBC... is just one
possibility.
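
The repeating patterns described above can be illustrated with a short string-level sketch. It assumes that each monomer is represented by a single letter and that polymerizing a composite "monomer" simply repeats it head to tail; the function name `polymerize` is hypothetical and is not part of the disclosed method.

```python
def polymerize(unit: str, copies: int) -> str:
    """Return the monomer order obtained by joining `copies` repeats of `unit`."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return unit * copies


# A and B are first ligated into one composite "monomer", then polymerized.
print(polymerize("AB", 3))    # ABABAB
print(polymerize("ABBC", 2))  # ABBCABBC
```

The sketch tracks only the order of monomers; it says nothing about the restriction sites needed to join them.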

II MULTIMER ASSEMBLIES AND MULTIMER CASSETTES



The present invention includes multimer assemblies made using the methods of the
present invention and novel cassettes incorporating novel restriction pair members. In
some preferred aspects of the present invention, a multimer assembly of the present
invention comprises two or more amplification cassettes, in which fused 5' and 3'
restriction pair member sites join the amplification cassettes. An amplification cassette
can comprise any practical number of monomer sequences.

Multimer assemblies of the present invention comprise component constructs
having 5' restriction pair members, 3' restriction pair members, or both 5' restriction pair
members and 3' restriction pair members that can be used to make multimer cassettes,
including multimer expression cassettes. Such cassettes are synthesized by joining
component cassettes (such as 5'-terminal cassettes, 3'-terminal cassettes, and
amplification cassettes) by ligating a 3' restriction pair member site of one component
cassette to a 5' restriction pair member site of another component cassette.

One multimer assembly of the present invention comprises one or more amplification
cassettes and at least one 3'-terminal cassette. Another multimer assembly of the present
invention comprises one or more amplification cassettes and at least one 5'-terminal
cassette. Another multimer assembly of the invention comprises one or more
amplification cassettes, at least one 3'-terminal cassette, and at least one 5'-terminal
cassette.

Multimer expression cassettes made from multimer assemblies of the present
invention include, for example, multimer cassettes in which a 5'-terminal cassette is
fused to an amplification cassette comprising a single monomer, multimer cassettes in
which a 5'-terminal cassette is fused to a multimer amplification cassette constructed
from multiple amplification cassettes, and multimer cassettes in which a 5'-terminal
cassette is fused to a multimer cassette comprising one or more amplification cassettes
and at least one 3'-terminal cassette. Multimer expression cassettes made from multimer
assemblies of the present invention also include, for example, multimer cassettes in
which a 3'-terminal cassette is fused to an amplification cassette, multimer cassettes in
which a 3'-terminal cassette is fused to a multimer amplification cassette constructed
from multiple amplification cassettes, and multimer cassettes in which a 3'-terminal
cassette is fused to a multimer cassette comprising one or more amplification cassettes
and at least one 5'-terminal cassette.

The present invention also includes novel amplification cassettes. In one aspect of
the present invention, an amplification cassette comprises at least one linker, in which at
least one of the one or more linkers comprises at least one restriction pair partner.
Amplification cassettes can be fused using restriction pair partners, at least one of which
is introduced in the linker, to form a multimer amplification cassette. The multimer
amplification cassette is made by joining two or more amplification
cassettes by ligating the first restriction pair partner of at least one of the two or more
amplification cassettes to the second restriction pair partner of at least one other of the
two or more amplification cassettes to generate a multimer cassette. The present
invention includes multimer amplification cassettes comprising component amplification
cassettes that incorporate linkers, and multimer assemblies and multimer expression
cassettes that include such multimer amplification cassettes.

Also included as amplification cassettes of the present invention are amplification
cassettes that comprise monomer sequences in noncontiguous orientation. For example,
an amplification cassette can comprise a 5' segment of a monomer sequence and a 3'
segment of a monomer sequence that together comprise the sequence of a complete
monomer, in which the 5' segment is positioned 3' of the 3' monomer segment. In these
embodiments, the 5' terminus of the 3' monomer segment is preferably a 5' restriction
pair member and the 3' terminus of the 5' monomer segment is preferably a 3' restriction
pair member. The present invention also includes multimer amplification cassettes
comprising two or more amplification cassettes that comprise monomer sequences in
noncontiguous orientation. Such multimer cassettes comprising multiple amplification
cassettes can be made by ligating a 3' restriction pair member of at least one of the two or
more amplification cassettes to a 5' restriction pair member of at least one other of the
two or more amplification cassettes. The present invention also includes multimer
assemblies and multimer expression cassettes that include such amplification and multimer
amplification cassettes.

In yet another aspect, the present invention includes amplification cassettes that
comprise 3' and 5' restriction pair members comprising restriction sites that are initially
ligation incompatible but are blunt ended to make them ligation compatible. The present
invention also includes multimer amplification cassettes comprising two or more
amplification cassettes that comprise noncompatible sites that have been blunt-ended and
then ligated to join the two or more amplification cassettes. The present invention also
includes multimer assemblies and multimer expression cassettes that include such
amplification and multimer amplification cassettes.

The invention includes multimer assembly cassettes in vectors, including cloning and
expression vectors, where expression vectors can be designed for in vitro or in vivo
expression. The vectors can be designed for in vivo expression in prokaryotes or
eukaryotes, including but not limited to, bacterial cells, fungal cells, algal cells, plant
cells, insect cells, avian cells, and mammalian cells. The present invention also
encompasses cells that include such vectors and polymeric proteins made using vectors
that comprise multimeric expression vectors of the present invention. The present
invention also encompasses polymeric proteins expressed from the multimeric assemblies
of the present invention.

The disclosed invention also encompasses the construction of different multimer
assemblies involving multimeric hGH, and multimer cassettes made using the methods of
the present invention that comprise multimerized hGH sequences or multimerized
portions of hGH. Sequences encoding hGH or portions thereof that are part of multimer
cassettes and multimer assemblies of the present invention include sequences that encode
hGH taking into account the redundancy of the genetic code. Sequences encoding hGH or
portions thereof that are part of multimer cassettes and multimer assemblies of the
present invention can also comprise sequence changes with respect to the human GH gene
sequence that change the amino acid sequence, where such changes do not detrimentally
affect the activity of the protein or portion thereof.

The hGH assemblies can differ in the functional elements included, such as those
provided by 3'- or 5'-terminal elements. The ease of producing these assemblies, and the
resulting multimers and polymers, demonstrates the utility of the methods disclosed. In
the examples below, restriction sites outside, and flanking, the restriction pair sites are
engineered in order to facilitate the manipulation of the cassettes.

Endogenous hGH appears in several forms in vivo as a result of expression from more
than one gene, as well as alternative gene splicing. The predominant mature form of
hGH is a single polypeptide chain consisting of 191 amino acids. The DNA and protein
sequences for this predominant form are given as SEQ ID NO: 1 and SEQ ID NO: 2,
respectively.

In the following paragraphs, the term "engineer" refers to using standard techniques
of molecular biology generally known to those skilled in the art. Standard techniques
include, but are not restricted to, restriction digestion and ligation, PCR amplification and
mutagenesis, DNA synthesis, DNA isolation and purification, etc., as described in
Sambrook et al. (2000), which is hereby incorporated by reference. As such, the details
are only described if they bear directly on the present invention or deviate from common
practice.

Examples 

A drawback to rhGH therapy is the need for once daily injections. Understandably,
patient preference is for a minimum of injections. In an attempt to overcome this, rhGH
has been formulated with PLGA in microspheres, chemically linked to PEG, and fused to
HSA in order to produce longer acting versions. Here we describe the construction of
families of multimeric rhGHs, according to the steps below using the general procedures
shown in Figures 1 to 8.

Example 1 

The first example involves isolation of the GH gene. Steps to isolate the hGH
gene are summarized in Figure 9. hGH is highly expressed in the anterior pituitary
gland. As a result, mRNA of hGH is abundantly found in lysates of human pituitary.
The gene for hGH is PCR amplified from human pituitary cDNA (Human Pituitary
Gland Quick-Clone™ cDNA, BD Biosciences Clontech, Palo Alto, CA, catalog #7173-1)
using SEQ ID NO: 3 as the 5' primer and SEQ ID NO: 4 as the 3' primer. The 5'
primer has an NdeI restriction enzyme site coding for an N-terminal methionine, and the
3' primer has a BamHI restriction enzyme site immediately after the TAG stop codon.
The resulting PCR fragment is isolated from the reaction mix using standard techniques,
as are all subsequent ones.

The purified PCR fragment is ligated into parent plasmid pET41a (Novagen,
Madison, WI) after both insert and plasmid are digested with NdeI and BamHI and
purified, again using standard techniques. This plasmid ligation mixture, and all others
unless otherwise indicated, is transformed into DH5α cells and plated on LB/antibiotic
plates. Single colonies are sub-cultured and plasmid DNA is isolated from each.
Restriction enzyme analysis is used to confirm the presence of an insert in the plasmid,
and plasmids with insert are sent for DNA sequencing using SEQ ID NO: 5 and SEQ ID
NO: 6 (Novagen, Madison, WI) as amplification primers for the 5' and 3' ends,
respectively. Plasmid with correct insert is identified as pOAO, and the DNA coding
region and corresponding open reading frame (ORF) translation are listed in SEQ ID NO:
7 and SEQ ID NO: 8, respectively. The convention for the sequences is that the
restriction sites are included at the termini of DNA sequences and only translated amino
acids that eventually appear in an expressed insert are given. Expression of protein from
pOAO yields a 192 amino acid protein consisting of full length hGH with an additional
N-terminal methionine.

It is convenient to engineer a high copy number plasmid that contains the hGH
gene and enables digestion of the hGH gene in its interior so that 5' or 3' elements can be
swapped in and out. The gene for hGH contains a convenient PstI site, CTGCAG. The
plasmid p04 (SEQ ID NO: 9), a derivative of pUC19 (New England Biolabs) containing
the same multi-cloning site as pET41a, is first readied by digesting with PstI, followed by
Mung Bean Nuclease, and subsequent re-ligation to destroy the internal PstI site to create
p04A1. Finally, the NdeI/BamHI hGH fragment from pOAO is ligated into similarly
digested p04A1 to yield pOA03.

Several examples are now given to generate assemblies for GH multimers with
different linkers. Variation in the linker sequence, as well as the degree of monomer
polymerization, may alter the polymer's ease of production, conformation, in vitro
activity, in vivo activity, immunogenicity, etc.

Example 2 

The second example involves generation of an assembly for the direct fusion multimer of
GH.

There is not a convenient restriction pair at the termini of rhGH, so this example
uses the methods for a monomer sequence with an internal restriction pair. A direct
fusion assembly for hGH is constructed with the features diagrammed in Figure 2.
Disclosed are two 5'-terminal cassettes, the amplification cassettes, and two 3'-terminal
cassettes. The 3'-terminal cassette is engineered to enable construction of an insertion
cassette, as shown in Figure 5. This facilitates insertion of amplification cassettes to
generate expressible genes for different size homopolymeric GHs.

Two 5'-terminal cassettes for the GH fusion protein assembly are disclosed. The
first is a direct start 5'-terminal cassette, and the second is an OmpA start 5'-terminal
cassette. The direct start results in an N-terminal methionine at the N-terminus of the
final expressed GH polymer. Its construction is straightforward because the insert in
pOAO and pOA03 already has the N-terminal methionine fused to the GH gene. In
contrast, the OmpA start codes for an N-terminal leader sequence that targets the polymer
to the periplasmic space of E. coli, resulting in the cleavage of the leader from the
polymer. There are many other 5'-terminal cassettes that can easily be generated by
those skilled in the art.

A pre-5'-terminal cassette is disclosed that enables fusion of the OmpA sequence
to any other blunt end or HindIII digested sequence. SEQ ID NO: 10 is a synthetic DNA
fragment that contains the coding sequence for the OmpA leader peptide, and its ORF
translation is listed in SEQ ID NO: 11. The fragment has a 5' NdeI site, the OmpA
leader coding region, a 3' HindIII site for HindIII ligation or blunt end ligation after
filling in the HindIII 5' overhang with T4 DNA polymerase, and a BamHI site for
cloning flexibility. Plasmid p04 is readied by digestion to destroy an internal site, this
time the HindIII site. The plasmid is digested with HindIII, followed by Mung Bean
Nuclease, and subsequently ligated back together to create p04A2. Both p04A2 and
insert DNA are digested with NdeI and BamHI and ligated together to yield the plasmid
p0C0A2 as shown in Figure 9.

For the current use, a GH sequence is needed that contains a 5' blunt end or
HindIII site, along with a 3' restriction site that is the 3' member of a restriction pair.
The 5' terminus is engineered with a HindIII site. Digestion with Mung Bean Nuclease
after digestion with HindIII results in a blunt 5' end that leaves the 5'-terminal codon of
GH, TTG, intact. Although the blunt end is not needed for the current example, in
general it is necessary for ligation to other hypothetical cassettes.

There are several choices for the restriction site pair, and we choose to use GH
amino acids 187 and 188, glycine and serine, that are compatible with, among other
enzymes, BamHI and BclI. The two enzymes recognize sequences GGATCC and
TGATCA, respectively. BamHI is assigned as the 3' member, and BclI is assigned as the
5' member.
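
The compatibility of this pair can be checked with a short string-level sketch. It assumes the standard cut positions G^GATCC for BamHI and T^GATCA for BclI and an in-frame reading of the site as two codons; the helper name `fuse_sites` and the minimal codon table are hypothetical illustrations, not part of the disclosed method.

```python
BAMHI = "GGATCC"  # recognition site, cut after the first base: G^GATCC
BCLI = "TGATCA"   # recognition site, cut after the first base: T^GATCA
CUT_OFFSET = 1    # both enzymes leave a four-base GATC 5' overhang

# Minimal codon table covering only the codons used in this sketch.
CODONS = {"GGA": "Gly", "TCC": "Ser", "TCA": "Ser"}


def fuse_sites(three_prime_member: str, five_prime_member: str) -> str:
    """Junction formed when an upstream fragment cut at its 3' member is
    ligated to a downstream fragment cut at its 5' member."""
    upstream_stub = three_prime_member[:CUT_OFFSET]
    downstream_stub = five_prime_member[CUT_OFFSET:]
    return upstream_stub + downstream_stub


# The overhangs match, so the two ends can be ligated.
assert BAMHI[CUT_OFFSET:5] == BCLI[CUT_OFFSET:5] == "GATC"

junction = fuse_sites(BAMHI, BCLI)  # BamHI is the 3' member, BclI the 5' member
encoded = [CODONS[junction[i:i + 3]] for i in range(0, 6, 3)]
print(junction, encoded)  # GGATCA ['Gly', 'Ser']
```

Under these assumptions the fused junction, GGATCA, is recognized by neither enzyme while still coding for glycine and serine, which is the property that lets a polymerized insert keep a single BclI site at its 5' end and a single BamHI site at its 3' end.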

The desired DNA sequence is generated by PCR using pOA03 as template, as
shown in Figure 10. The 5' and 3' primers are listed in SEQ ID NO: 12 and SEQ ID
NO: 13, respectively, and the DNA coding region for the insert between the 5' flanking
NdeI and 3' BamHI sites is listed in SEQ ID NO: 14. The fragment is digested with
HindIII and BamHI and inserted into similarly digested p04B1 to yield pOA01. Plasmid
p04B1 is prepared by destroying the HindIII site in p04A1 as described for the
preparation of p04A2. The result is a parent plasmid with the PstI and HindIII sites
destroyed.

The 5'-terminal cassettes are now constructed from the generated sequences as
shown in Figure 10. The XbaI/HindIII fragment from p0C0A2 is inserted into plasmid
pOA01 to generate pOA11A2. The result is the OmpA 5'-terminal cassette for the GH
direct fusion assembly. It contains the OmpA sequence fused directly to the 5' coding
region of GH. The resulting DNA insert between NdeI and BamHI is listed in SEQ ID
NO: 15, with corresponding ORF listed in SEQ ID NO: 16. The direct translation start
5'-terminal cassette is constructed by ligating fragments from existing sequences. The
PstI/BamHI 5' GH fragment and plasmid backbone that results from digesting pOA03 is
ligated with the PstI/BamHI 3' GH fragment that results from digesting pOA01 to yield
pOA11A1. The resulting DNA sequence between NdeI and BamHI, and the
corresponding ORF, for pOA11A1 are listed in SEQ ID NO: 17 and SEQ ID NO: 18,
respectively.

As shown in Figure 2, the amplification cassette must contain several
components. First, it must have both the 5' and 3' members of the restriction pair to
enable polymerization. In between must be the entire continuous GH sequence. Finally,
if convenient, there should be flanking restriction sites for insertion and extraction of the
sequence from a plasmid backbone.

The amplification cassette for the current direct fusion of GH is generated by
PCR, as shown in Figure 11. The 5' primer is listed in SEQ ID NO: 19. It contains an
NdeI site, the 5' restriction pair member BclI, followed by the codons that together code
for GH amino acids 187-191, and finally codons to anneal to the GH 5'-terminal codons.
The 3' primer is the one previously used and listed in SEQ ID NO: 13. The PCR template is
pOA03. The resulting insert DNA sequence between NdeI and BamHI is listed in SEQ
ID NO: 20, with ORF sequence listed in SEQ ID NO: 21. The DNA sequence is inserted
into plasmid p04A1 to yield pOA11B.

Two simple 3'-terminal cassettes are disclosed, as shown in Figure 12. Both
code for the 3' terminus of GH, starting at the glycine and serine codons within the BclI
site, amino acids 187 and 188, and ending with the translation stop codon, TAG. The
first cassette, given in SEQ ID NO: 22, is a direct translation stop. The double stranded
DNA is synthesized and contains an EcoRI site flanking the 5' terminus, a BclI site to
ligate to BamHI, the 3' terminus of GH, a stop codon, and a SalI site for cloning
flexibility. It is inserted into p04A1 by digesting the synthetic DNA and p04A1 with
EcoRI and SalI and ligating the large fragments together to yield plasmid pOA11C1. The
C-terminal ORF protein sequence contributed by this cassette to subsequent GH multimer
constructs is given in SEQ ID NO: 23.

The second 3'-terminal cassette, given in SEQ ID NO: 24, is a synthetic DNA
fragment similar to the first, except it contains the codons for a three amino acid polylysine
tail before the stop codon. It is analogously inserted into p04A1 to yield plasmid
pOA11C2. The polylysine tail is potentially useful for chemical conjugation with other
molecules. SEQ ID NO: 25 is the C-terminal ORF sequence contributed by the new
insert to subsequent GH multimer constructs.

Once the basic cassettes are complete, the amplification cassette can be
polymerized, the 5'-terminal and 3'-terminal cassettes can be joined to form an insertion
cassette, and finally amplification cassettes can be ligated to the insertion cassette to
generate expressible multimers.

Example 3

The polymerization of the GH direct fusion amplification cassettes is performed
as shown in general in Figure 3 and specifically in Figure 13. The first polymerization
is formation of the dimer. Plasmid pOA11B is digested with NdeI and BclI and the
plasmid isolated. In a separate reaction, pOA11B is digested with NdeI/BamHI and the
insert isolated. The two fragments are then ligated together to yield plasmid pOA11B2.
Its insert DNA sequence is listed in SEQ ID NO: 26, and the corresponding ORF
translation is listed in SEQ ID NO: 27. This process is repeated, changing the identity,
and thus the size, of amplification cassettes 1 and 2 in Figure 13 to construct polymer
inserts of different sizes. The size of new constructs is increased fastest if the
polymerization is done geometrically, each time using the most recent construct for both
cassettes 1 and 2. The size is increased by one if the monomer amplification cassette,
pOA11B, is used either as cassette 1 or 2. The generalized sequences for the resulting
amplification cassettes are given in SEQ ID NO: 28 and SEQ ID NO: 29 for the DNA
and protein, respectively.
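
The two kinds of polymerization step described above, doubling and adding a single monomer, can be combined to reach any target size. The following is a minimal planning sketch; the double-and-add schedule it uses is an assumption of this illustration, and the function name `ligation_plan` is hypothetical rather than part of the disclosed procedure.

```python
from typing import List


def ligation_plan(target_size: int) -> List[int]:
    """Return the successive amplification cassette sizes on the way to target_size.

    Each step either doubles the current cassette (ligating it to a copy of
    itself) or adds one monomer (ligating it to the monomer cassette).
    """
    if target_size < 1:
        raise ValueError("target_size must be at least 1")
    sizes = [1]  # start from the monomer amplification cassette
    for bit in bin(target_size)[3:]:  # binary digits after the leading 1
        sizes.append(sizes[-1] * 2)   # geometric step
        if bit == "1":
            sizes.append(sizes[-1] + 1)  # add a single monomer
    return sizes


print(ligation_plan(8))  # [1, 2, 4, 8]
print(ligation_plan(5))  # [1, 2, 4, 5]
```

In this sketch an 8-mer amplification cassette needs three rounds of ligation, whereas adding one monomer at a time would need seven.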

Example 4 

The cassettes for the GH direct fusion assembly are designed to enable construction of
insertion cassettes to facilitate generation of a variety of expressible polymers. The
general procedures are shown in Figures 5 and 7 and the specifics in Figure 14.
Different insertion cassettes can be generated with the various 5'-terminal and
3'-terminal cassettes. However, only the one involving pOA11A1 and pOA11C1 is
described here. Others are constructed in exactly the same way.

Plasmid pOA11A1 is digested with EcoRI and SalI and the opened plasmid is
isolated. Plasmid pOA11C1 is digested with the same enzyme pair and the insert isolated.
The two fragments are ligated together to generate the insertion cassette, pOA11D, and
the resulting DNA sequence is listed in SEQ ID NO: 30. Plasmid pOA11D is compatible
with ligation of any of the amplification cassettes for this assembly. It need be prepared
only once for all subsequent ligations, as long as the supply is sufficient.

Example 5

Either of the two schemes shown in Figures 6 and 7 can be used to ligate 
amplification cassettes into the insertion cassette. The example given here utilizes the 
oriented ligation shown in Figure 7 and subsequent digestion and re-ligation to generate 
final products as shown in Figure 14. 

Plasmid pOA11D is digested with BamHI and EcoRI, and the plasmid is isolated.
An amplification cassette is digested with BclI and EcoRI and the insert isolated.
Ligation of the two fragments yields an intermediate that is converted to the multimer
expression cassette after digestion with BamHI and BclI, purification, and subsequent
re-ligation. The result is an expression ready insert for the direct fusion growth hormone
multimer. When performed with the Nmer amplification cassette, the result is an N+1
multimer expression cassette. The insert has general DNA sequence listed in SEQ ID
NO: 31 and corresponding ORF translation listed in SEQ ID NO: 32. The production of
different size multimers is controlled by the size of the ligated amplification cassette.
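
As a rough check on construct size, the relationship between the Nmer amplification cassette and the N+1 multimer can be turned into a residue count. This is a sketch under stated assumptions: it takes the 191 residue length of mature hGH given earlier, assumes one extra methionine for a direct start and none when a leader is cleaved, and treats linker and tail lengths as simple additive parameters. The function name `polymer_length` is hypothetical.

```python
HGH_LENGTH = 191  # residues in the predominant mature form of hGH


def polymer_length(n_amplification, linker_length=0, tail_length=0, start_met=True):
    """Estimate the residue count of the expressed polymer.

    n_amplification: monomers in the ligated amplification cassette (N)
    linker_length:   residues added at each junction contributed by an
                     amplification cassette (0 for a direct fusion)
    tail_length:     residues added by the 3'-terminal cassette (0 for a direct stop)
    start_met:       True for a direct start; False if a leader is cleaved off
    """
    if n_amplification < 0:
        raise ValueError("n_amplification must not be negative")
    monomers = n_amplification + 1  # the insertion cassette supplies one monomer
    length = monomers * HGH_LENGTH + n_amplification * linker_length + tail_length
    return length + (1 if start_met else 0)


print(polymer_length(1))                 # 383: direct fusion dimer with N-terminal Met
print(polymer_length(1, tail_length=3))  # 386: same dimer with a three residue tail
```

With `n_amplification=0` the function returns 192, matching the single hGH with an N-terminal methionine described for pOAO.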

Protein expression is achieved by digesting and ligating the multimer expression
cassette insert into an appropriate expression system. For example, the insert can be
liberated with NdeI and SalI and ligated into similarly digested pET41a, followed by
transformation into E. coli strain BL21(DE3) (Novagen).

One utility of the invention is the ease of production of different size multimers
and different variations once the basic cassettes, pOA11A1, pOA11A2, pOA11B,
pOA11C1, and pOA11C2, for example, are constructed. Those skilled in the art can easily
see how substituting pOA11C2 for pOA11C1 when generating the insertion cassette
generates a polylysine tail variant.

Example 6 

The next example involves generation of a GH multimer with a linker without a
convenient restriction pair. The one amino acid linker, glycine, is used as an example.
The construction of GH multimers with a glycine linker is analogous to the construction
of the fusion protein. In fact, the GH glycine linker assembly shares the same 5'- and 3'-
terminal cassettes with the GH fusion protein assembly. This is one advantage of the
assembly construction scheme given in Figure 2. Assemblies differing only in the linker
region only need different amplification cassettes, while sharing the same 5'- and 3'-
terminal cassettes.

Use pOA11A1 and pOA11A2 as before for the direct start and OmpA 5'-terminal
cassettes for the direct fusion assembly. Use pOA11C1 and pOA11C2 as before for direct
stop and polylysine 3'-terminal cassettes.

The only difference is in the amplification cassettes, which contain a glycine codon
between the ending and starting codons for GH. The glycine linker amplification cassette
is made in the same way as the one for the direct fusion homomultimer except for some
necessary substitutions of sequences, as shown in Figure 15. SEQ ID NO: 33 is
substituted for SEQ ID NO: 19 as the 5' PCR primer. It contains the same elements as
before, as well as the glycine codon between the sequence for amino acids 191 and 1.
The resulting PCR fragment is inserted into parent plasmid p04A1 by digesting both the
parent plasmid and the PCR fragment with NdeI and BamHI and ligating the appropriate
fragments together. The resulting plasmid is labeled pOA21B. The DNA sequence and
ORF translation for the insert sequence between NdeI and BamHI are listed in SEQ ID
NO: 34 and SEQ ID NO: 35, respectively.

The construction of additional amplification assemblies, the insertion cassette,
and multimer expression cassettes for the GH glycine linker assembly is identical in
practice to the one for the GH direct fusion assembly, Figures 13 and 14, except for the
substitution of pOA21B for pOA11B. The corresponding generalized amplification
cassette insert DNA and ORF sequences are listed in SEQ ID NO: 36 and SEQ ID NO:
37, and the general formulas for the multimer expression cassettes are listed in SEQ ID
NO: 38 and SEQ ID NO: 39.

The previous examples have demonstrated, among other things, the ease with which
multiple 5'- and 3'-terminal cassettes can be used to introduce variations in the N- and
C-termini of a polymer. In the case of the 5'-terminal cassettes, cassettes with either a
direct translation start or one introducing a leader sequence are disclosed. In the case of
the 3'-terminal cassettes, ones with either a direct stop or one introducing a polylysine
tail are disclosed. Each demonstrates the ease with which functional elements can be added
to the beginning or end of a polymer sequence. These methods are easily extended to
other examples by those skilled in the art. Therefore, subsequent examples will be
limited to the presentation of only a single 5'- and 3'-terminal cassette for each assembly.

The next examples involve generation of GH multimers utilizing linkers that
result in monomers with a terminal restriction pair. Figure 1 details the general features
for these assemblies.

Example 7

This example involves a linker that is noteworthy because it contains a 3'
restriction pair member with a functional stop codon that is destroyed upon
polymerization. Use of this linker makes it possible to express functional multimers
using just the 5'-terminal and amplification cassettes. However, a 3'-terminal cassette is
necessary to express homomultimers without any residual linker at the 3' terminus of the
protein.

The 5' restriction pair member is NcoI, C^CATGG, while the 3' restriction pair
member is RcaI, T^CATGA. Therefore, the resulting linker sequence is A-Ser-Trp-B,
where A and B are arbitrary protein sequences. For the given example, A is a null
sequence, and B is G4S, where the single letter amino acid abbreviations are used.
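
The way this pair converts a stop codon into the Ser-Trp junction can be traced with a short sketch. It assumes the cut positions written above (C^CATGG and T^CATGA) and an in-frame reading of the 3' site as two codons; the minimal codon table and the helper name `translate` are illustrative assumptions.

```python
NCOI = "CCATGG"  # 5' restriction pair member, cut C^CATGG
RCAI = "TCATGA"  # 3' restriction pair member, cut T^CATGA

# Minimal codon table covering only the codons used in this sketch.
CODONS = {"TCA": "Ser", "TGA": "stop", "TGG": "Trp"}


def translate(dna: str) -> list:
    """Translate an in-frame DNA string using the minimal table above."""
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


# Before polymerization the 3' site reads as serine followed by a stop codon.
unligated = translate(RCAI)

# After ligation the upstream T of the 3' site is joined to CATGG from NcoI.
junction = RCAI[:1] + NCOI[1:]
ligated = translate(junction)

print(unligated)          # ['Ser', 'stop']
print(junction, ligated)  # TCATGG ['Ser', 'Trp']
```

The fused junction TCATGG matches neither recognition sequence, so under these assumptions it is not re-cut by either enzyme in later rounds.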

For this example, only one 5'-terminal cassette is disclosed, with a direct ATG
start codon and no leader sequence, as shown in Figure 16. The PCR primers for the
5'-terminal cassette are listed in SEQ ID NO: 3 and SEQ ID NO: 40, for the 5' and 3' ends,
respectively. The 5' primer maintains the NdeI site and its start codon, while the 3'
primer introduces a stop codon within an RcaI (or BspHI) restriction site, immediately
followed by a BamHI site. The template for the reaction is pOAO.

Because the RcaI restriction site also contains the codon TCA immediately 5' of
the stop codon, it also introduces a C-terminal serine residue. The resulting PCR
fragment is purified and ligated into pET41a in an analogous manner for the generation
of pOAO. The sequence verified plasmid is labeled pOA31A, and the DNA coding region,
from the NdeI to the BamHI site, and the resulting ORF protein sequence are listed in
SEQ ID NO: 41 and SEQ ID NO: 42, respectively. Expression of protein from the gene
for pOA31A yields a 193 amino acid protein consisting of full length hGH with an
additional N-terminal methionine and C-terminal serine.

The PCR primers for the amplification cassette are listed in SEQ ID NO: 43 and
SEQ ID NO: 40, for the 5' and 3' ends, respectively. The 5' primer introduces an NcoI
site followed by the linker region. The NcoI site is ligation compatible with the 3' RcaI
site, and any such ligation destroys the TGA stop codon by altering it to a TGG codon.
The resulting PCR fragment is purified and ligated into pET41a after the PCR product
and plasmid are cut with NcoI and BamHI, as shown in Figure 16. The sequence
verified plasmid is labeled pOA31B, and the DNA coding region from the NcoI to the
BamHI site is listed in SEQ ID NO: 44. The ORF protein sequence coded by the insert is
given in SEQ ID NO: 45.

Again, for this example, only one 3'-terminal cassette is disclosed, with a direct
TAG stop codon and no other 3'-specific sequences. The 3'-terminal cassette is
constructed using PCR with pOAO as template and SEQ ID NO: 43 and SEQ ID NO: 4 as
5' and 3' primers, respectively. This creates a cassette with a 5' linker and a 3' stop
codon immediately following the last amino acid from the parent monomer. The PCR
fragment is inserted into pET41a as before and shown in Figure 16 to create pOA31C.
The resulting DNA and protein fragments between the NdeI and BamHI sites are listed in
SEQ ID NO: 46 and SEQ ID NO: 47, respectively.

The scheme for the polymerization of the amplification cassettes is shown in
Figure 3. Additional care is necessary because the parent plasmid contains RcaI sites.
One way to unambiguously liberate the insert sequence for polymerization is to first
digest the flanking BamHI site, isolate the insert, and then digest with RcaI. The general
formulas for the Nmer amplification cassette are listed in SEQ ID NO: 48 and SEQ ID
NO: 49 for the DNA and corresponding ORF translation, respectively.

Example 8 

The ligation of the multimer assembly cassettes must be done sequentially, as
shown in Figure 4, because the arrangement of the restriction sites in the 3'-terminal
cassette is like Figure 2d. The first ligation involves the 5'-terminal and amplification
cassettes, rather than the 3'-terminal and amplification cassettes, to take advantage of the
stop codon in the 3'-restriction member to produce expression-ready inserts. The
specifics are shown in Figure 17 using procedures already described. Use of the
monomeric amplification cassette, pOA31B, results in the dimeric cassette, pOA31F2,
with insert DNA and corresponding ORF translation listed in SEQ ID NO: 50 and SEQ
ID NO: 51. The general formulas for the N+1mer produced after ligation between the
Nmer amplification and the 5'-terminal cassettes are listed in SEQ ID NO: 52 and SEQ
ID NO: 53. Transfer of the insert into an appropriate expression system yields expression
of the N+1 GH polymer with the SWG4S linker and C-terminal S residue.
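
As a rough check on the bookkeeping, the residue count of the N+1mer can be estimated in Python. This is a sketch under stated assumptions, not a restatement of SEQ ID NO: 53: it assumes each product consists of one N-terminal methionine, N+1 copies of the 191-residue mature hGH sequence (SEQ ID NO: 2), N seven-residue SWG4S linkers, and one C-terminal serine. The function name is hypothetical.

```python
GH_LEN = 191          # residues in mature hGH (SEQ ID NO: 2)
LINKER = "SWGGGGS"    # the SWG4S linker named in the text, written out


def n_plus_1_mer_length(n):
    """Assumed residue count: Met + (N+1) GH units + N linkers + C-terminal Ser."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1 + (n + 1) * GH_LEN + n * len(LINKER) + 1


for n in (0, 1, 2, 4, 8):
    print(f"N={n}: {n + 1}-mer, {n_plus_1_mer_length(n)} residues")
```

With N = 0 the estimate reduces to 193 residues, which agrees with the length stated above for the protein expressed from pOA31A.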

Completion of the ligation scheme shown in Figure 17 results in an insert with an
additional monomer and the natural C-terminus of GH. If the insert from pOA31F2 is
ligated into pOA31C, then the trimer expression cassette pOA31E3 is generated. In
general, the formulas for the insert DNA and corresponding ORF translation when the
Nmer amplification cassette is used are listed in SEQ ID NO: 54 and SEQ ID NO: 55.
For pOA31E3, the monomer amplification cassette is used and N=1.

Example 9

The plasmids containing the inserts generated with the ligation scheme shown in
Figure 17 are capable of expressing rhGH polymers following standard techniques (see,
for example, user manuals from Novagen, Madison, WI). DNA sequences listed in SEQ
ID NO: 52 with N = 0, 1, 2, 4, and 8 and prepared according to Example 8 are ligated into
pET41a. The resulting plasmids are separately transformed into BL21(DE3),
separately grown in Luria broth medium, and induced to express the polymer protein by
adding IPTG to a concentration of 1 mM.

Following 3 hours of induction, each culture is harvested by centrifugation and
treated with SDS-PAGE sample buffer. Proteins from the samples for each culture are
separated according to their molecular weights on a standard SDS-PAGE gel (Invitrogen,
Carlsbad, CA). The resulting gel is stained with Coomassie blue stain to visualize the
protein bands. Results for the monomer (SEQ ID NO: 42), dimer (N=1 in SEQ ID NO:
53), trimer (N=2 in SEQ ID NO: 53), pentamer (N=4 in SEQ ID NO: 53), and nonamer
(N=8 in SEQ ID NO: 53) are given in Figure 18. As the figure demonstrates, large
amounts of each polymeric rhGH are produced except for the nonamer.

Example 10 

Linkers with convenient restriction sites offer the engineering option to generate a
multitude of assemblies with cassettes that can be attached to monomers using
restriction/ligation techniques. The utility of this formulation lies in the breadth of
assemblies that can be constructed relatively easily. This is especially apparent when the
linkers themselves are treated as assemblies nested within the construction of the
multimers. Once constructed, these linker assemblies and cassettes, like any other, can
be reused to produce new assemblies.

Nested linker assemblies are constructed having a slightly different function than
the multimer assemblies. They still need an amplification cassette for the polymerization
of the linker. However, the other cassettes in the assembly enable attachment of the
linker to either a 5' or 3' terminus, whichever is appropriate.

The example given here is a series of linkers having amino acid sequence GZGS,
where Z is an arbitrary sequence of arbitrary length. The series of linkers in Table 1
below share features that enable them to be treated similarly in terms of their engineering.
All but one have a glycine at the N-terminus of the linker that can be coded by an NaeI
restriction site at the 5' end for blunt-end ligation of a 5'-terminal cassette to a monomer
pre-cassette. For the other linker, GS, a synthetic DNA fragment must be ligated to the
monomer pre-cassette without propagation within a plasmid. Each of the linkers ends in
the protein sequence GS, so that the restriction pair is identical to earlier examples
utilizing the BclI and BamHI sites.

Table 1

Linker protein    5'-terminal cassette    Amplification cassette
monomer unit      DNA sequence            DNA sequence

GS                GGATCC                  TGATCAGGATCC
GGS               GCCGGCGGATCC            TGATCAGGCGGATCC
GGGS              GCCGGCGGCGGATCC         TGATCAGGCGGCGGATCC
GGGGS             GCCGGCGGCGGCGGATCC      TGATCAGGCGGGGGCGGATCC
GZGS              GCCGGCYGGATCC           TGATCAGGCYGGATCC

Z is an arbitrary protein sequence, and Y is its DNA coding sequence.
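
The entries of Table 1 can be checked against their stated linker units with a short Python sketch. It assumes that NaeI cuts GCC^GGC to leave a blunt end, so the 5'-terminal cassette contributes everything from the GGC codon onward, and that each amplification cassette contributes the codons following its BclI site. The dictionary and function names are hypothetical and introduced only for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1; length must be a multiple of three."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Rows of Table 1 that begin with an NaeI site (GCCGGC).
TABLE_1 = {
    "GGS": ("GCCGGCGGATCC", "TGATCAGGCGGATCC"),
    "GGGS": ("GCCGGCGGCGGATCC", "TGATCAGGCGGCGGATCC"),
    "GGGGS": ("GCCGGCGGCGGCGGATCC", "TGATCAGGCGGGGGCGGATCC"),
}


def linker_from_five_prime(dna):
    """Peptide encoded downstream of the blunt NaeI cut (GCC^GGC)."""
    if not dna.startswith("GCCGGC"):
        raise ValueError("cassette does not begin with an NaeI site")
    return translate(dna[3:])


def linker_from_amplification(dna):
    """Peptide encoded by the codons that follow the BclI site (TGATCA)."""
    if not dna.startswith("TGATCA"):
        raise ValueError("cassette does not begin with a BclI site")
    return translate(dna[6:])


for unit, (five_prime, amplification) in TABLE_1.items():
    print(unit, linker_from_five_prime(five_prime), linker_from_amplification(amplification))
```

Each row is expected to return its own linker unit from both cassettes; every cassette ends in GGATCC, the BamHI site that encodes the terminal Gly-Ser.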

As a single example of the engineering of the linker assembly, we construct the
(G4S)x linker, where x indicates the degree of polymerization of the monomer sequence.
The assembly is engineered like any other, and it falls into the scheme shown in Figure
1. The specifics are shown in Figure 19.

Two synthetic DNA sequences are needed, SEQ ID NO: 56 and SEQ ID NO: 57.
The first, the 5'-terminal cassette labeled as pOD11A in Figure 19, is the sequence
enabling addition of the linker sequence to other cassettes. It is flanked by an NcoI site,
and thus with an upstream NdeI site, for cloning flexibility at the 5' terminus, contains
the NaeI site to create the blunt-end ligation with the glycine codon at the 5' terminus, the
linker sequence, and finally the BamHI site within the GS codons. Plasmid p04 is
prepared by digestion with NgoMIV, digestion with Mung Bean Nuclease, and finally
re-ligation to destroy the internal NaeI site, creating plasmid p04A3. This altered plasmid,
along with the insert, is digested with NcoI and BamHI and the appropriate fragments are
ligated together. The resulting plasmid is labeled pOD11A. The open reading frame
translation between the cleaved NaeI and the entire BamHI sites is G4S.

SEQ ID NO: 57 is the sequence for the amplification cassette to create multimers
of the G4S linker. It is flanked by an NcoI site, again for cloning flexibility. It has the 5'
BclI site from the restriction pair, followed by the G4S coding sequence that ends with the
BamHI site. It is inserted into p04 by cutting both plasmid and insert with NcoI and
BamHI and ligating the appropriate fragments together, as shown in Figure 19. The
resulting plasmid is labeled pOD11B.

Amplification cassette pOD11B is polymerized by the scheme shown in Figure 3,
left-hand side, to create a dimer. In this instance the decision to follow the left-hand side
scheme results in larger fragments that are easier to isolate. Plasmid pOD11B is digested
with NdeI and BclI and the large fragment is isolated. Separately, the same parent
plasmid is digested with NdeI and BamHI, this time isolating the small fragment. The
two isolated fragments are then ligated together, destroying the internal BclI and BamHI
sites, but preserving the flanking ones. The resulting plasmid is labeled pOD11B2, the
DNA insert is listed in SEQ ID NO: 58, and the ORF translation is listed in SEQ ID NO:
59. The sequence codes for the dimer (G4S)2. The process can be repeated with different
starting cassettes to generate any (G4S)x linker. In this manner, (G4S)4 can be generated
by digesting pOD12B with NdeI and BclI and saving the large fragment and ligating in
the small fragment generated by digesting it with NdeI and BamHI.
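
The doubling step can be illustrated with a minimal Python sketch. It models only the top strand of the insert, uses the Table 1 amplification cassette for GGGGS as a stand-in for the insert (an assumption; SEQ ID NO: 57 itself is not reproduced here), and the function names are hypothetical. BamHI cuts G^GATCC and BclI cuts T^GATCA, so the joined GGATCA junction is recognized by neither enzyme while still encoding Gly-Ser.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

BCLI = "TGATCA"    # cut T^GATCA
BAMHI = "GGATCC"   # cut G^GATCC


def translate(dna):
    """Translate a DNA string in frame 1; length must be a multiple of three."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def polymerize(upstream, downstream):
    """Join the BamHI-cut 3' end of `upstream` to the BclI-cut 5' end of `downstream`."""
    left = upstream[: upstream.rindex(BAMHI) + 1]      # keep through the G before the cut
    right = downstream[downstream.index(BCLI) + 1:]    # keep from GATCA onward
    return left + right


# Table 1 amplification cassette for GGGGS, used here as a toy insert.
monomer = "TGATCAGGCGGGGGCGGATCC"
dimer = polymerize(monomer, monomer)
tetramer = polymerize(dimer, dimer)

for name, insert in (("monomer", monomer), ("dimer", dimer), ("tetramer", tetramer)):
    # The coding portion is taken to start after the flanking BclI site.
    print(name, translate(insert[6:]), insert.count(BCLI), insert.count(BAMHI))
```

Because each product retains exactly one flanking BclI site and one flanking BamHI site, it can serve as the starting cassette for the next round, which is why repeated self-ligation doubles the degree of polymerization.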

The engineering of the G4S assembly enables the construction of a GH multimer
assembly with the (G4S)3 linker. The (G4S)3 5'-terminal cassette for ligation to the GH
sequences is generated following the general scheme shown in Figure 4. Plasmid
pOD11B2 is digested with NdeI and BclI, and the large fragment is isolated. The small
fragment resulting from digestion of pOD11A with NdeI and BamHI is ligated in,
creating plasmid pOD13A. The DNA and ORF sequences for the insert are listed in SEQ
ID NO: 60 and SEQ ID NO: 61, respectively. The insert in pOD13A enables ligation of
the (G4S)3 linker to the 3' end of any sequence ending in a blunt end.

Example 11

Engineering of the GH (G4S)3 assembly requires two new ends to the GH gene.
The BclI 5' restriction pair member is needed on the 5' terminus of the amplification and
3'-terminal cassettes, and a blunt end immediately after the last codon of GH is needed
on the 3' terminus of the 5'-terminal and amplification cassettes for ligation of the (G4S)3
linker. There are many ways to get a blunt end at the 3' terminus of GH. Disclosed here
is the use of an NcoI site that is made blunt after digestion with Mung Bean Nuclease. In
addition, it is convenient to introduce a stop codon flanked by the SalI restriction site at
the 3' terminus of the GH gene for construction of an insertion cassette, as shown in
general in Figure 5.
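
The blunt-end strategy can be sketched in Python as follows. The layout is an assumption made for illustration: the final C of the last GH codon (TTC) is taken to double as the first base of the NcoI site CCATGG, so that NcoI digestion (C^CATGG) followed by removal of the single-stranded CATG overhang by Mung Bean Nuclease leaves a blunt end directly after the last GH base pair. The toy sequences and function names are hypothetical; the linker fragment is the Table 1 5'-terminal cassette for GGGGS cut with NaeI (GCC^GGC).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1; length must be a multiple of three."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def ncoi_then_mung_bean(seq):
    """Upstream duplex left after NcoI digestion and trimming of the CATG overhang."""
    return seq[: seq.rindex("CCATGG") + 1]


def naei_downstream(seq):
    """Blunt fragment downstream of the NaeI cut (GCC^GGC)."""
    return seq[seq.index("GCCGGC") + 3:]


# Toy 3' end of GH: Cys-Gly-Phe (tgt ggc ttc, as at the end of SEQ ID NO: 1),
# followed by CATGG to complete the NcoI site and a flanking EcoRI site.
gh_three_prime = "TGTGGCTTC" + "CATGG" + "GAATTC"
linker_cassette = "GCCGGCGGCGGCGGATCC"

fusion = ncoi_then_mung_bean(gh_three_prime) + naei_downstream(linker_cassette)
print(fusion, translate(fusion))
```

Under these assumptions the fusion reads Cys-Gly-Phe followed directly by Gly-Gly-Gly-Gly-Ser, with no extra residues at the junction.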

Three new primers are used to generate the new termini on two new GH inserts
by PCR using pOA03 as template, as shown in Figure 20. The 5' primer is listed in SEQ
ID NO: 62. It contains a flanking NdeI site, the BclI 5' restriction pair member, and
sequence complementary to the GH 5' terminus. It is used for both PCR reactions. The
3' primers are listed in SEQ ID NO: 63 and SEQ ID NO: 64. Both contain sequence
complementary to the 3' terminus of GH. The first codes for the NcoI site at the 3'
terminus for creation of a blunt end after the last GH base pair and a flanking EcoRI site,
while the second introduces a stop codon followed by a SalI restriction site.

The PCR fragments are ligated into plasmid backbones as shown in Figure 20.
The PCR fragment resulting from use of the primers listed in SEQ ID NO: 62 and SEQ
ID NO: 63 is digested with NdeI and EcoRI and ligated into similarly digested p04A1 to
yield pOA04, while the fragment resulting from use of the primers listed in SEQ ID NO:
62 and SEQ ID NO: 64 is digested with BclI and SalI and ligated into similarly digested
pOA11C1 to give pOA41C. The insert in pOA04 between the BclI and blunt-ended NcoI
sites has the DNA sequence listed in SEQ ID NO: 65 and corresponding ORF translation
listed in SEQ ID NO: 66. Likewise, the insert in pOA41C, the 3'-terminal cassette,
between the BclI and SalI sites has the DNA sequence listed in SEQ ID NO: 67 and ORF
translation listed in SEQ ID NO: 68.

The amplification cassette is generated first by ligating the (G4S)3 linker from
plasmid pOD13A with the insert in pOA04, as shown in Figure 21. Plasmid pOD13A is
digested with NaeI and HindIII, and the small fragment is isolated. It is ligated into
pOA04 after digestion first with NcoI, then Mung Bean Nuclease, and finally HindIII to
yield pOA43B. The resulting DNA sequence for the amplification cassette between the
BclI and BamHI sites is listed in SEQ ID NO: 69, with corresponding ORF translation in
SEQ ID NO: 70.

The direct start 5'-terminal cassette is generated by combining the 5' elements
from pOA11A1 with the 3' elements from pOA43B, as shown in Figure 21. The small
fragment resulting from digesting pOA43B with PstI and EcoRI is isolated. It is ligated to
the large fragment resulting from digestion of pOA11A1 with the same enzymes to yield
pOA43A. The DNA sequence for the insert between NdeI and BamHI is listed in SEQ ID
NO: 71, with corresponding ORF translation in SEQ ID NO: 72.

The polymerization of the amplification cassettes again follows the scheme in 
Figure 3. The general formulas for the insert DNA and corresponding ORF translation 
for the Nmer amplification cassette are listed in SEQ ID NO: 73 and SEQ ID NO: 74. 

The ligation of the cassettes for the GH (G4S)3 linker assembly to create a
multimer expression cassette follows the previously described scheme shown in Figure 7
and demonstrated in Example 4. The insertion cassette is first generated with the 5'- and
3'-terminal cassettes using EcoRI and SalI digestions. An amplification cassette insert is
first isolated after digestion with BclI and EcoRI and then spliced into the insertion
cassette after digestion using BamHI and EcoRI. The resultant construct is subsequently
digested with BamHI and BclI and re-ligated. The resulting N+2 multimer expression
cassette, where N is the degree of polymerization of the amplification cassette used, has
DNA and corresponding ORF translation sequences listed in SEQ ID NO: 75 and SEQ
ID NO: 76. Transfer of the insert into a suitable expression system yields multimeric GH
with (G4S)3 linker.

Example 12 

The last example is an alternative construction for a GH direct fusion assembly.
It involves the use of an incompatible restriction pair that is blunt-ended for ligation.
Construction of this new assembly is done by ligating together fragments from earlier
cassettes, since they already contain the needed elements. The construction scheme is
shown in Figure 22.

The 5'-terminal cassette is labeled pOA51A. It is generated by combining
elements from pOA11A1 and pOA04. Plasmid pOA11A1 is digested with PstI and EcoRI
and the open plasmid isolated. This is ligated with the insert isolated after digesting
pOA04 with the same enzymes. The result, pOA51A, has DNA and corresponding ORF
translation listed in SEQ ID NO: 77 and SEQ ID NO: 78.

The amplification and 3'-terminal cassettes are constructed in exactly the same
manner as the 5'-terminal cassette, except for substituting which plasmids are digested.
For the amplification cassette, plasmid pOA01 is ligated with the insert from pOA04. The
insert DNA and corresponding ORF sequences are listed in SEQ ID NO: 79 and SEQ ID
NO: 80. Likewise, for the 3'-terminal cassette, plasmid pOA01 is ligated with the insert
from pOA03. Its insert DNA and corresponding ORF translation are listed in SEQ ID
NO: 81 and SEQ ID NO: 82.

The polymerization of amplification cassettes still follows the scheme in Figure
3. However, digestion at a restriction pair member now requires the additional blunt
ending of its overhang. Figure 23 shows the specifics for the current assembly. The
digestions of the cassette are done sequentially so that the restriction pair is blunt-ended,
but the flanking restriction sites are left intact. The general formulas for the amplification
cassettes are listed in SEQ ID NO: 83 and SEQ ID NO: 84.

The ligation of the multimer assembly cassettes is done sequentially as shown in 
Figure 4. The digestion of any plasmid is performed as described above with blunt 
ending of the restriction pair member first. The general formulas for the resulting 
multimer expression cassette insert, using the Nmer amplification cassette, are listed in
SEQ ID NO: 85 and SEQ ID NO: 86.

In practice, ligations of cassettes from this assembly involve more steps, but the
technique's almost universal applicability may make it the method of choice in some
instances. For the current case, the assembly given in Examples 1-4 is easier to
manipulate.

* * * * *



Those skilled in the art will recognize many equivalents to the examples presented
herein, using different monomers, linkers, restriction pairs, flanking restriction sites, 5'
specific sequences, 3' specific sequences, and ligation strategies. For example, the
methods are flexible as to the order of ligating 5'-terminal cassettes, 3'-terminal cassettes,
and amplification cassettes, and in ligating amplification cassettes to one another to form
higher order amplification cassettes. Combining elements of the following claims
presented here and in the description, including the examples, is within the scope of the
invention and is encompassed in the following claims.

All references cited herein, including the bibliography, are incorporated by
reference in their entireties.

SEQUENCE LISTING

<110> Gentide Biopharmaceuticals, Inc.
Bussell, Stuart

<120> METHODS TO CONSTRUCT MULTIMERIC DNA AND POLYMERIC PROTEIN
SEQUENCES AS DIRECT FUSIONS OR WITH LINKERS

<130> GNT-OOlOl.P.l-US

<150> US 60/396,466
<151> 2002-07-16

<160> 86

<170> PatentIn version 3.0

<210> 1 

<211> 573

<212> DNA 

<213> Homo sapiens 

<400> 1 

ttcccaacca ttcccttatc caggcttttt gacaacgcta tgctccgcgc ccatcgtctg 
60 

caccagctgg cctttgacac ctaccaggag tttgaagaag cctatatccc aaaggaacag 
120 

aagtattcat tcctgcagaa cccccagacc tccctctgtt tctcagagtc tattccgaca 
180 

ccctccaaca gggaggaaac acaacagaaa tccaacctag agctgctccg catctccctg
240

ctgctcatcc agtcgtggct ggagcccgtg cagttcctca ggagtgtctt cgccaacagc
300

ctggtgtacg gcgcctctga cagcaacgtc tatgacctcc taaaggacct agaggaaggc
360

atccaaacgc tgatggggag gctggaagat ggcagccccc ggactgggca gatcttcaag
420

cagacctaca gcaagttcga cacaaactca cacaacgatg acgcactact caagaactac
480

gggctgctct actgcttcag gaaggacatg gacaaggtcg agacattcct gcgcatcgtg 
540 

cagtgccgct ctgtggaggg cagctgtggc ttc 
573 



<210> 2

<211> 191

<212> PRT

<213> Homo sapiens

<220>

<221> mat_peptide

<222> (1)..()

<400> 2

Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg
1 5 10 15

Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu
20 25 30

Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro
35 40 45

Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg
50 55 60

Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu
65 70 75 80

Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val
85 90 95

Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp
100 105 110

Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu
115 120 125

Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser
130 135 140

Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr
145 150 155 160

Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe
165 170 175

Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

<210> 3 
<211> 38 
<212> DNA
<213> Artificial 



<220> 

<223> synthetic sequence 

<400> 3 

ggaattccat atgttcccaa ccattccctt atccaggc 
38 

<210> 4 

<211> 36 

<212> DNA 

<213> Artificial 



<220> 

<223> synthetic sequence 

<400> 4 

cgcggatccc tagaagccac agctgccctc cacaga
36

<210> 5

<211> 19 

<212> DNA 

<213> Artificial 

<220> 

<223> synthetic sequence 

<400> 5 

taatacgact cactatagg 
19 

<210> 6 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthetic sequence 

<400> 6 

tgctagttat tgctcagcgg t 

21 

<210> 7 

<211> 588 

<212> DNA 

<213> Artificial 



<220> 

<223> synthetic sequence 

<400> 7 

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttct agggatcc
588



<210> 8

<211> 192

<212> PRT

<213> Artificial

<220>

<223> synthetic sequence

<220>

<221> mat_peptide

<222> (1)..()

<400> 8

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

<210> 9

<211> 2907

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<400> 9

ccggatatag ttcctccttt cagcaaaaaa cccctcaaga cccgtttaga ggccccaagg 
60 

ggttatgcta gttattgctc agcggtggca gcagccaact cagcttcctt tcgggctttg
120

tttagcagcc taggtattaa tcaattagtg gtggtggtgg tggtggtggt gctcgagtgc
180

ggccgcaagc ttgtcgacgg agctcgcctg caggcgcgcc aaggcctgta cagaattcgg
240

atccccgata tcccatggga ctcttgtcgt cgtcatcacc ggagccacca ccggtaccca
300

gatctgggct gtccatgtgc tggcgttcga atttagcagc agcggtttct ttcataccaa
360

ttgcagtact accgcgtggc accagacccg cggagtgatg gtgatggtga tgaccagaac
420

cactagtaca cacatatgta tatctccttc ttaaagttaa acaaaattat ttctagaggg
480

gaattgttat ccgctcacaa ttcccctata gtgagtcgta ttaatttcgc gggatcgaga
540

tcgatctcga tcctctacgc cggacgcatc gtggccggca tcaccggcgc cacaggtgcg
600

gttgctggcg cctatatcgc cgacatcacc gatggggaag atcgggctcg ccacttcggg
660

ctcatgagcg cttgtttcgg cgtgggtatg gtggcaggcc ccgtggccgg gggactgttg
720

ggcgccatct ccttgcatgc atggcgtaat catggtcata gctgtttcct gtgtgaaatt
780

gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt aaagcctggg
840

gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc gctttccagt
900

cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg agaggcggtt
960

tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc
1020

tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg
1080

ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg
1140

ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac
1200

gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg
1260

gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct
1320

ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat ctcagttcgg
1380

tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct
1440

gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac
1500

tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt
1560

tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt atctgcgctc
1620

tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca
1680

ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat
1740

ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac
1800

gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt
1860

aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc
1920

aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg
1980

cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg
2040

ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca ataaaccagc
2100

cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta
2160

ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg
2220

ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct
2280

ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta
2340

gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg
2400

ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga
2460

ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt
2520

gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca 
2580 

ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt 
2640 

cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt 
2700 

ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga
2760

aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat cagggttatt
2820

gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc
2880

gcacatttcc ccgaaaagtg ccacctg
2907



<210> 10

<211> 73

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<220>

<221> sig_peptide

<222> (7)..(69)

<400> 10

cgccatatga aaaagacagc tatcgcgatt gcagtggcac tggctggttt cgctaccgta
60

gcgcaagctt gag
73



<210> 11

<211> 21

<212> PRT

<213> Escherichia coli

<220>

<221> SIGNAL

<222> (1)..(21)

<400> 11

Met Lys Lys Thr Ala Ile Ala Ile Ala Val Ala Leu Ala Gly Phe Ala
1 5 10 15

Thr Val Ala Gln Ala
20

<210> 12

<211> 42

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<400> 12

ggacatatgc tgaagctttc ccaaccattc ccttatccag gc
42

<210> 13

<211> 28

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<400> 13

cgcggatccc tccacagagc ggcactgc
28



<210> 14

<211> 578

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<400> 14

catatgctga agctttccca accattccct tatccaggct ttttgacaac gctatgctcc
60

gcgcccatcg tctgcaccag ctggcctttg acacctacca ggagtttgaa gaagcctata
120

tcccaaagga acagaagtat tcattcctgc agaaccccca gacctccctc tgtttctcag
180

agtctattcc gacaccctcc aacagggagg aaacacaaca gaaatccaac ctagagctgc
240

tccgcatctc cctgctgctc atccagtcgt ggctggagcc cgtgcagttc ctcaggagtg
300

tcttcgccaa cagcctggtg tacggcgcct ctgacagcaa cgtctatgac ctcctaaagg
360

acctagagga aggcatccaa acgctgatgg ggaggctgga agatggcagc ccccggactg
420

ggcagatctt caagcagacc tacagcaagt tcgacacaaa ctcacacaac gatgacgcac
480

tactcaagaa ctacgggctg ctctactgct tcaggaagga catggacaag gtcgagacat
540

tcctgcgcat cgtgcagtgc cgctctgtgg agggatcc
578



<210> 15

<211> 630

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<220>

<221> source

<222> (4)..(66)

<223> Escherichia coli

<220>

<221> source

<222> (67)..(630)

<223> Homo sapiens

<400> 15

catatgaaaa agacagctat cgcgattgca gtggcactgg ctggtttcgc taccgtagcg
60

caagctttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
120

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
180

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
240

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
300

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
360

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
420

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
480

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
540

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
600

atcgtgcagt gccgctctgt ggagggatcc
630



<210> 16

<211> 208

<212> PRT

<213> Artificial

<220>

<223> synthetic sequence

<220>

<221> SIGNAL

<222> (1)..(21)

<223> from Escherichia coli

<400> 16

Met Lys Lys Thr Ala Ile Ala Ile Ala Val Ala Leu Ala Gly Phe Ala
1 5 10 15

Thr Val Ala Gln Ala Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp
20 25 30

Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr
35 40 45

Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser
50 55 60

Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro
65 70 75 80

Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu
85 90 95

Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln
100 105 110

Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp
115 120 125

Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr
130 135 140

Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe
145 150 155 160

Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala
165 170 175

Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp
180 185 190

Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
195 200 205



<210> 17

<211> 570

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<400> 17

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggatcc
570



<210> 18

<211> 188

<212> PRT

<213> Artificial

<220>

<223> synthetic sequence

<400> 18

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
180 185



<210> 19

<211> 52

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<400> 19

taccatatga catgatcatg tggcttcttc ccaaccattc ccttatccag gc
52

20 

<2i0> 20 

<211> 588 

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence
<400> 20

catatgacat gatcatgtgg cttcttccca accattccct tatccaggct ttttgacaac 
60 

gctatgctcc gcgcccatcg tctgcaccag ctggcctttg acacctacca ggagtttgaa 
120

gaagcctata tcccaaagga acagaagtat tcattcctgc agaaccccca gacctccctc 
180 

tgtttctcag agtctattcc gacaccctcc aacagggagg aaacacaaca gaaatccaac
240 

ctagagctgc tccgcatctc cctgctgctc atccagtcgt ggctggagcc cgtgcagttc 
300 

ctcaggagtg tcttcgccaa cagcctggtg tacggcgcct ctgacagcaa cgtctatgac 
360 



ctcctaaagg acctagagga aggcatccaa acgctgatgg ggaggctgga agatggcagc
420



ccccggactg ggcagatctt caagcagacc tacagcaagt tcgacacaaa ctcacacaac 
480 



gatgacgcac tactcaagaa ctacgggctg ctctactgct tcaggaagga catggacaag 
540

gtcgagacat tcctgcgcat cgtgcagtgc cgctctgtgg agggatcc 
588 


<210> 21
<211> 191
<212> PRT

<213> Artificial

<220>

<223> synthetic sequence

<400> 21

Ser Cys Gly Phe Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn
1 5 10 15

Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr
20 25 30

Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe
35 40 45

Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr
50 55 60

Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu
65 70 75 80

Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe
85 90 95

Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser
100 105 110

Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu
115 120 125

Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys
130 135 140

Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu
145 150 155 160

Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys
165 170 175

Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
180 185 190

<210> 22
<211> 42
<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<400> 22

tacgaattcc attgatcatg tggcttctag taggtcgacg at
42

<210> 23

<211> 4

<212> PRT

<213> Artificial

<220>

<223> synthetic sequence
<400> 23

Ser Cys Gly Phe
1



<210> 24 

<211> 51 

<212> DNA 

<213> Artificial

<220>

<223> synthetic sequence

<400> 24

tacgaattcc attgatcatg tggcttcaaa aagaaatagt aggtcgacga t
51



<210> 25 

<211> 7 

<212> PRT

<213> Artificial

<220>

<223> synthetic sequence

<400> 25

Ser Cys Gly Phe Lys Lys Lys
1 5

<210> 26

<211> 1161 

<212> DNA 

<213> Artificial 



<220>

<223> synthetic sequence
<400> 26

catatgacat gatcatgtgg cttcttccca accattccct tatccaggct ttttgacaac
60 

gctatgctcc gcgcccatcg tctgcaccag ctggcctttg acacctacca ggagtttgaa 
120 


gaagcctata tcccaaagga acagaagtat tcattcctgc agaaccccca gacctccctc 
180 

tgtttctcag agtctattcc gacaccctcc aacagggagg aaacacaaca gaaatccaac 
240

ctagagctgc tccgcatctc cctgctgctc atccagtcgt ggctggagcc cgtgcagttc 
300 

ctcaggagtg tcttcgccaa cagcctggtg tacggcgcct ctgacagcaa cgtctatgac
360



ctcctaaagg acctagagga aggcatccaa acgctgatgg ggaggctgga agatggcagc 
420 



ccccggactg ggcagatctt caagcagacc tacagcaagt tcgacacaaa ctcacacaac
480 

gatgacgcac tactcaagaa ctacgggctg ctctactgct tcaggaagga catggacaag 
540 


gtcgagacat tcctgcgcat cgtgcagtgc cgctctgtgg agggatcatg tggcttcttc 
600 

ccaaccattc ccttatccag gctttttgac aacgctatgc tccgcgccca tcgtctgcac 
660

cagctggcct ttgacaccta ccaggagttt gaagaagcct atatcccaaa ggaacagaag 
720 

tattcattcc tgcagaaccc ccagacctcc ctctgtttct cagagtctat tccgacaccc
780 

tccaacaggg aggaaacaca acagaaatcc aacctagagc tgctccgcat ctccctgctg 
840 


ctcatccagt cgtggctgga gcccgtgcag ttcctcagga gtgtcttcgc caacagcctg 
900 

gtgtacggcg cctctgacag caacgtctat gacctcctaa aggacctaga ggaaggcatc 
960

caaacgctga tggggaggct ggaagatggc agcccccgga ctgggcagat cttcaagcag 
1020 

acctacagca agttcgacac aaactcacac aacgatgacg cactactcaa gaactacggg
1080

ctgctctact gcttcaggaa ggacatggac aaggtcgaga cattcctgcg catcgtgcag
1140

tgccgctctg tggagggatc c 
1161 



<210> 27

<211> 382

<212> PRT

<213> Artificial

<220>

<223> synthetic sequence

<400> 27

Ser Cys Gly Phe Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn
1 5 10 15

Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr
20 25 30

Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe
35 40 45

Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr
50 55 60

Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu
65 70 75 80

Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe
85 90 95

Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser
100 105 110

Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu
115 120 125

Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys
130 135 140

Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu
145 150 155 160

Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys
165 170 175

Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser
180 185 190

Cys Gly Phe Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala
195 200 205

Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln
210 215 220

Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu
225 230 235 240

Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro
245 250 255

Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg
260 265 270

Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu
275 280 285

Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn
290 295 300

Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met
305 310 315 320

Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln
325 330 335

Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu
340 345 350

Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val
355 360 365

Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
370 375 380

<210> 28 

<211> 1152 



<212> DNA 
<213> Artificial

<220>

<223> synthetic sequence
<220>

<221> misc_feature

<222> (574)..(1146)

<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 28

tgatcatgtg gcttcttccc aaccattccc ttatccaggc tttttgacaa cgctatgctc 
60 

cgcgcccatc gtctgcacca gctggccttt gacacctacc aggagtttga agaagcctat
120

atcccaaagg aacagaagta ttcattcctg cagaaccccc agacctccct ctgtttctca 
180 

gagtctattc cgacaccctc caacagggag gaaacacaac agaaatccaa cctagagctg
240



ctccgcatct ccctgctgct catccagtcg tggctggagc ccgtgcagtt cctcaggagt 
300 

gtcttcgcca acagcctggt gtacggcgcc tctgacagca acgtctatga cctcctaaag 
360

gacctagagg aaggcatcca aacgctgatg gggaggctgg aagatggcag cccccggact 
420 

gggcagatct tcaagcagac ctacagcaag ttcgacacaa actcacacaa cgatgacgca
480 

ctactcaaga actacgggct gctctactgc ttcaggaagg acatggacaa ggtcgagaca 
540 


ttcctgcgca tcgtgcagtg ccgctctgtg gagggatcat gtggcttctt cccaaccatt 
600 

cccttatcca ggctttttga caacgctatg ctccgcgccc atcgtctgca ccagctggcc 
660

tttgacacct accaggagtt tgaagaagcc tatatcccaa aggaacagaa gtattcattc 
720 

ctgcagaacc cccagacctc cctctgtttc tcagagtcta ttccgacacc ctccaacagg
780

gaggaaacac aacagaaatc caacctagag ctgctccgca tctccctgct gctcatccag 
840 

tcgtggctgg agcccgtgca gttcctcagg agtgtcttcg ccaacagcct ggtgtacggc
900 

gcctctgaca gcaacgtcta tgacctccta aaggacctag aggaaggcat ccaaacgctg 
960

atggggaggc tggaagatgg cagcccccgg actgggcaga tcttcaagca gacctacagc 
1020 

aagttcgaca caaactcaca caacgatgac gcactactca agaactacgg gctgctctac
1080 

tgcttcagga aggacatgga caaggtcgag acattcctgc gcatcgtgca gtgccgctct 
1140 

gtggagggat cc
1152



<210> 29
<211> 382
<212> PRT
<213> Artificial

<220>

<223> synthetic sequence
<220>

<221> MISC_FEATURE

<222> (191)..(381)

<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 29

Ser Cys Gly Phe Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn
1 5 10 15

Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr
20 25 30

Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe
35 40 45

Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr
50 55 60

Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu
65 70 75 80

Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe
85 90 95

Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser
100 105 110

Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu
115 120 125

Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys
130 135 140

Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu
145 150 155 160

Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys
165 170 175

Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser
180 185 190

Cys Gly Phe Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala
195 200 205

Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln
210 215 220

Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu
225 230 235 240

Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro
245 250 255

Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg
260 265 270

Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu
275 280 285

Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn
290 295 300

Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met
305 310 315 320

Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln
325 330 335

Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu
340 345 350

Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val
355 360 365

Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
370 375 380

<210> 30 

<211> 606 

<212> DNA 

<213> Artificial 



<220> 

<223> synthetic sequence 
<400> 30 

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat 
60 

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag 
120 






gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt 
180 

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc 
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc 
300 

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360 



gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc 
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag 
480 

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540 

atcgtgcagt gccgctctgt ggagggatcc gaattccatt gatcatgtgg cttctagtag 
600 

gtcgac 
606 



<210> 31

<211> 1737

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence
<220>

<221> misc_feature

<222> (1138)..(1710)

<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 31



78 



catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggatca tgtggcttct tcccaaccat tcccttatcc
600

aggctttttg acaacgctat gctccgcgcc catcgtctgc accagctggc ctttgacacc
660

taccaggagt ttgaagaagc ctatatccca aaggaacaga agtattcatt cctgcagaac
720

ccccagacct ccctctgttt ctcagagtct attccgacac cctccaacag ggaggaaaca
780

caacagaaat ccaacctaga gctgctccgc atctccctgc tgctcatcca gtcgtggctg
840

gagcccgtgc agttcctcag gagtgtcttc gccaacagcc tggtgtacgg cgcctctgac
900

agcaacgtct atgacctcct aaaggaccta gaggaaggca tccaaacgct gatggggagg
960

ctggaagatg gcagcccccg gactgggcag atcttcaagc agacctacag caagttcgac
1020

acaaactcac acaacgatga cgcactactc aagaactacg ggctgctcta ctgcttcagg
1080

aaggacatgg acaaggtcga gacattcctg cgcatcgtgc agtgccgctc tgtggaggga
1140



tcatgtggct tcttcccaac cattccctta tccaggcttt ttgacaacgc tatgctccgc 
1200 

gcccatcgtc tgcaccagct ggcctttgac acctaccagg agtttgaaga agcctatatc 
1260

ccaaaggaac agaagtattc attcctgcag aacccccaga cctccctctg tttctcagag 

1320 

tctattccga caccctccaa cagggaggaa acacaacaga aatccaacct agagctgctc
1380 






cgcatctccc tgctgctcat ccagtcgtgg ctggagcccg tgcagttcct caggagtgtc 
1440 

ttcgccaaca gcctggtgta cggcgcctct gacagcaacg tctatgacct cctaaaggac 
1500 



ctagaggaag gcatccaaac gctgatgggg aggctggaag atggcagccc ccggactggg 
1560

cagatcttca agcagaccta cagcaagttc gacacaaact cacacaacga tgacgcacta 
1620 

ctcaagaact acgggctgct ctactgcttc aggaaggaca tggacaaggt cgagacattc
1680 



ctgcgcatcg tgcagtgccg ctctgtggag ggatcatgtg gcttctagta ggtcgac 

1737 



<210> 32 
<211> 574 
<212> PRT 



<213> Artificial 



<220> 

<223> synthetic sequence 
<220> 

<221> MISC_FEATURE 

<222> (379)..(569)

<223> sequence is repeated N-1 times, where N is a positive whole
number

<220>
<221> mat_peptide
<222> (1)..()

<400> 32

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg
195 200 205

Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu
210 215 220

Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro
225 230 235 240

Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg
245 250 255

Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu
260 265 270

Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val
275 280 285

Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp
290 295 300

Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu
305 310 315 320

Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser
325 330 335

Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr
340 345 350

Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe
355 360 365

Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe Phe
370 375 380

Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala
385 390 395 400

His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu
405 410 415

Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro Gln
420 425 430

Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg Glu
435 440 445

Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu Leu
450 455 460

Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val Phe
465 470 475 480

Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp Leu
485 490 495

Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu Glu
500 505 510

Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser Lys
515 520 525

Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr Gly
530 535 540

Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe Leu
545 550 555 560

Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
565 570



<210> 33 

<211> 55 

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence

<400> 33

taccatatga catgatcatg tggcttcggt ttcccaacca ttcccttatc caggc
55

<210> 34

<211> 591 

<212> DNA 

<213> Artificial 



<220>

<223> synthetic sequence
<400> 34

catatgacat gatcatgtgg cttcggtttc ccaaccattc ccttatccag gctttttgac
60 



aacgctatgc tccgcgccca tcgtctgcac cagctggcct ttgacaccta ccaggagttt 
120 

gaagaagcct atatcccaaa ggaacagaag tattcattcc tgcagaaccc ccagacctcc 
180 



ctctgtttct cagagtctat tccgacaccc tccaacaggg aggaaacaca acagaaatcc 
240

aacctagagc tgctccgcat ctccctgctg ctcatccagt cgtggctgga gcccgtgcag 
300 

ttcctcagga gtgtcttcgc caacagcctg gtgtacggcg cctctgacag caacgtctat
360



gacctcctaa aggacctaga ggaaggcatc caaacgctga tggggaggct ggaagatggc 
420 

agcccccgga ctgggcagat cttcaagcag acctacagca agttcgacac aaactcacac 
480 

aacgatgacg cactactcaa gaactacggg ctgctctact gcttcaggaa ggacatggac 
540 

aaggtcgaga cattcctgcg catcgtgcag tgccgctctg tggagggatc c 
591



<210> 35 

<211> 192 

<212> PRT 

<213> Artificial 

<220>

<223> synthetic sequence
<400> 35

Ser Cys Gly Phe Gly Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp
1 5 10 15

Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr
20 25 30

Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser
35 40 45

Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro
50 55 60

Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu
65 70 75 80

Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln
85 90 95

Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp
100 105 110

Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr
115 120 125

Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe
130 135 140

Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala
145 150 155 160

Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp
165 170 175

Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
180 185 190

<210> 36 
<211> 1158 



<212> DNA 

<213> Artificial 



<220>

<223> synthetic sequence
<220>

<221> misc_feature

<222> (577)..(1152)

<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 36

tgatcatgtg gcttcggttt cccaaccatt cccttatcca ggctttttga caacgctatg 
60 



ctccgcgccc atcgtctgca ccagctggcc tttgacacct accaggagtt tgaagaagcc
120 



tatatcccaa aggaacagaa gtattcattc ctgcagaacc cccagacctc cctctgtttc 
180 

tcagagtcta ttccgacacc ctccaacagg gaggaaacac aacagaaatc caacctagag 
240 



ctgctccgca tctccctgct gctcatccag tcgtggctgg agcccgtgca gttcctcagg 
300

agtgtcttcg ccaacagcct ggtgtacggc gcctctgaca gcaacgtcta tgacctccta 
360 

aaggacctag aggaaggcat ccaaacgctg atggggaggc tggaagatgg cagcccccgg
420



actgggcaga tcttcaagca gacctacagc aagttcgaca caaactcaca caacgatgac 
480 

gcactactca agaactacgg gctgctctac tgcttcagga aggacatgga caaggtcgag
540

acattcctgc gcatcgtgca gtgccgctct gtggagggat catgtggctt cggtttccca 
600 

accattccct tatccaggct ttttgacaac gctatgctcc gcgcccatcg tctgcaccag
660 






ctggcctttg acacctacca ggagtttgaa gaagcctata tcccaaagga acagaagtat 
720 

tcattcctgc agaaccccca gacctccctc tgtttctcag agtctattcc gacaccctcc 
780 



aacagggagg aaacacaaca gaaatccaac ctagagctgc tccgcatctc cctgctgctc 
840

atccagtcgt ggctggagcc cgtgcagttc ctcaggagtg tcttcgccaa cagcctggtg 
900 

tacggcgcct ctgacagcaa cgtctatgac ctcctaaagg acctagagga aggcatccaa
960 



acgctgatgg ggaggctgga agatggcagc ccccggactg ggcagatctt caagcagacc 
1020 

tacagcaagt tcgacacaaa ctcacacaac gatgacgcac tactcaagaa ctacgggctg
1080 



ctctactgct tcaggaagga catggacaag gtcgagacat tcctgcgcat cgtgcagtgc 
1140



cgctctgtgg agggatcc 
1158 



<210> 37
<211> 384
<212> PRT

<213> Artificial

<220>

<223> synthetic sequence
<220>
<221> MISC_FEATURE

<222> (192)..(383)

<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 37



Ser Cys Gly Phe Gly Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp
1 5 10 15

Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr
20 25 30

Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser
35 40 45

Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro
50 55 60

Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu
65 70 75 80

Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln
85 90 95

Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp
100 105 110

Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr
115 120 125

Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe
130 135 140

Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala
145 150 155 160

Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp
165 170 175

Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
180 185 190

Ser Cys Gly Phe Gly Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp
195 200 205

Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr
210 215 220

Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser
225 230 235 240

Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro
245 250 255

Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu
260 265 270

Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln
275 280 285

Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp
290 295 300

Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr
305 310 315 320

Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe
325 330 335

Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala
340 345 350

Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp
355 360 365

Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
370 375 380

<210> 38 

<211> 1743 

<212> DNA 

<213> Artificial 



<220> 

<223> synthetic sequence 
<220> 

<221> misc_feature

<222> (1141)..(1716)

<223> sequence is repeated N-1 times, where N is a positive whole
number



<400> 38 

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat 
60 
cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120 



gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt 
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc 
240 

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300 






aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag 
360 

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc 
420 



ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag 
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc 
540 

atcgtgcagt gccgctctgt ggagggatca tgtggcttcg gtttcccaac cattccctta
600 



tccaggcttt ttgacaacgc tatgctccgc gcccatcgtc tgcaccagct ggcctttgac 
660 

acctaccagg agtttgaaga agcctatatc ccaaaggaac agaagtattc attcctgcag 
720 



aacccccaga cctccctctg tttctcagag tctattccga caccctccaa cagggaggaa 
780

acacaacaga aatccaacct agagctgctc cgcatctccc tgctgctcat ccagtcgtgg 
840 

ctggagcccg tgcagttcct caggagtgtc ttcgccaaca gcctggtgta cggcgcctct
900 



gacagcaacg tctatgacct cctaaaggac ctagaggaag gcatccaaac gctgatgggg 
960 

aggctggaag atggcagccc ccggactggg cagatcttca agcagaccta cagcaagttc 
1020 



gacacaaact cacacaacga tgacgcacta ctcaagaact acgggctgct ctactgcttc 
1080

aggaaggaca tggacaaggt cgagacattc ctgcgcatcg tgcagtgccg ctctgtggag 
1140 

ggatcatgtg gcttcggttt cccaaccatt cccttatcca ggctttttga caacgctatg
1200

ctccgcgccc atcgtctgca ccagctggcc tttgacacct accaggagtt tgaagaagcc
1260 

tatatcccaa aggaacagaa gtattcattc ctgcagaacc cccagacctc cctctgtttc 
1320

tcagagtcta ttccgacacc ctccaacagg gaggaaacac aacagaaatc caacctagag 
1380 

ctgctccgca tctccctgct gctcatccag tcgtggctgg agcccgtgca gttcctcagg
1440 






agtgtcttcg ccaacagcct ggtgtacggc gcctctgaca gcaacgtcta tgacctccta 
1500 

aaggacctag aggaaggcat ccaaacgctg atggggaggc tggaagatgg cagcccccgg 
1560 



actgggcaga tcttcaagca gacctacagc aagttcgaca caaactcaca caacgatgac 
1620

gcactactca agaactacgg gctgctctac tgcttcagga aggacatgga caaggtcgag 
1680 

acattcctgc gcatcgtgca gtgccgctct gtggagggat catgtggctt ctagtaggtc
1740 



gac 
1743 



<210> 39 
<211> 576 
<212> PRT 
<213> Artificial 

<220>

<223> synthetic sequence

<220>

<221> MISC_FEATURE

<222> (380)..(571)

<223> sequence is repeated N-1 times, where N is a positive whole
number

<220>

<221> mat_peptide
<222> (1)..()

<400> 39

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Gly Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
195 200 205

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
210 215 220

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
225 230 235 240

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
245 250 255

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
260 265 270

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
275 280 285

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
290 295 300

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
305 310 315 320

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
325 330 335

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
340 345 350

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
355 360 365

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
370 375 380

Gly Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
385 390 395 400

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
405 410 415

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
420 425 430

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
435 440 445

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
450 455 460

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
465 470 475 480

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
485 490 495

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
500 505 510

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
515 520 525

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
530 535 540

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
545 550 555 560

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
565 570 575

<210> 40 

<211> 39 

<212> DNA 

<213> Artificial 



<220> 

<223> synthetic sequence 

<400> 40 

cgcggatcct catgagaagc cacagctgcc ctccacaga 
39

<210> 41 

<211> 591 

<212> DNA 

<213> Artificial 

<220> 

<223> synthetic sequence 
<400> 41 

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat 
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120 

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt 
180 






ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc 
240 



tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc 
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360 



gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc 
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag 
480 

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540 

atcgtgcagt gccgctctgt ggagggcagc tgtggcttct catgaggatc c 
591 

<210> 42
<211> 193

<212> PRT

<213> Artificial

<220>

<223> synthetic sequence
<400> 42

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Ser



<210> 43

<211> 50

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence
<400> 43

catgccatgg ggtggtggag gaagtttccc aaccattccc ttatccaggc
50

<210> 44

<211> 606

<212> DNA

<213> Artificial

<220>

<223> synthetic sequence
<400> 44

ccatggggtg gtggaggaag tttcccaacc attcccttat ccaggctttt tgacaacgct
60

atgctccgcg cccatcgtct gcaccagctg gcctttgaca cctaccagga gtttgaagaa 
120 

gcctatatcc caaaggaaca gaagtattca ttcctgcaga acccccagac ctccctctgt
180



ttctcagagt ctattccgac accctccaac agggaggaaa cacaacagaa atccaaccta
240 

gagctgctcc gcatctccct gctgctcatc cagtcgtggc tggagcccgt gcagttcctc 
300

aggagtgtct tcgccaacag cctggtgtac ggcgcctctg acagcaacgt ctatgacctc 
360 

10 ctaaaggacc tagaggaagg catccaaacg ctgatgggga ggctggaaga tggcagcccc 
420 



15 



25 



50 



cggactgggc agatcttcaa gcagacctac agcaagttcg acacaaactc acacaacgat 
480 

gacgcactac tcaagaacta cgggctgctc tactgcttca ggaaggacat ggacaaggtc 
540 



gagacattcc tgcgcatcgt gcagtgccgc tctgtggagg gcagctgtgg cttctcatga 
20 600 



ggatcc 
606 



<210> 45
<211> 198
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence
<400> 45

Trp Gly Gly Gly Gly Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe
1 5 10 15

Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp
20 25 30

Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr
35 40 45

Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile
50 55 60

Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu
65 70 75 80

Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val
85 90 95

Gln Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser
100 105 110

Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln
115 120 125

Thr Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile
130 135 140

Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp
145 150 155 160

Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met
165 170 175

Asp Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu
180 185 190

Gly Ser Cys Gly Phe Ser
195

<210> 46
<211> 603
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence
<400> 46

ccatggggtg gtggaggaag tttcccaacc attcccttat ccaggctttt tgacaacgct
60

atgctccgcg cccatcgtct gcaccagctg gcctttgaca cctaccagga gtttgaagaa
120

gcctatatcc caaaggaaca gaagtattca ttcctgcaga acccccagac ctccctctgt
180

ttctcagagt ctattccgac accctccaac agggaggaaa cacaacagaa atccaaccta
240

gagctgctcc gcatctccct gctgctcatc cagtcgtggc tggagcccgt gcagttcctc
300

aggagtgtct tcgccaacag cctggtgtac ggcgcctctg acagcaacgt ctatgacctc
360

ctaaaggacc tagaggaagg catccaaacg ctgatgggga ggctggaaga tggcagcccc
420

cggactgggc agatcttcaa gcagacctac agcaagttcg acacaaactc acacaacgat
480

gacgcactac tcaagaacta cgggctgctc tactgcttca ggaaggacat ggacaaggtc
540

gagacattcc tgcgcatcgt gcagtgccgc tctgtggagg gcagctgtgg cttctaggga
600

tcc
603



<210> 47 

<211> 197 

<212> PRT 

<213> Artificial 

<220> 

<223> synthetic sequence 
<400> 47 

Trp Gly Gly Gly Gly Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe
1 5 10 15

Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp
20 25 30

Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr
35 40 45

Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile
50 55 60

Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu
65 70 75 80

Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val
85 90 95

Gln Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser
100 105 110

Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln
115 120 125

Thr Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile
130 135 140

Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp
145 150 155 160

Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met
165 170 175

Asp Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu
180 185 190

Gly Ser Cys Gly Phe
195

<210> 48
<211> 1200
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> misc_feature
<222> (595)..(1188)
<223> sequence is repeated N-1 times, where N is a positive whole number

<400> 48

ccatggggtg gtggaggaag tttcccaacc attcccttat ccaggctttt tgacaacgct
60

atgctccgcg cccatcgtct gcaccagctg gcctttgaca cctaccagga gtttgaagaa
120

gcctatatcc caaaggaaca gaagtattca ttcctgcaga acccccagac ctccctctgt
180

ttctcagagt ctattccgac accctccaac agggaggaaa cacaacagaa atccaaccta
240

gagctgctcc gcatctccct gctgctcatc cagtcgtggc tggagcccgt gcagttcctc
300

aggagtgtct tcgccaacag cctggtgtac ggcgcctctg acagcaacgt ctatgacctc
360

ctaaaggacc tagaggaagg catccaaacg ctgatgggga ggctggaaga tggcagcccc
420

cggactgggc agatcttcaa gcagacctac agcaagttcg acacaaactc acacaacgat
480

gacgcactac tcaagaacta cgggctgctc tactgcttca ggaaggacat ggacaaggtc
540

gagacattcc tgcgcatcgt gcagtgccgc tctgtggagg gcagctgtgg cttctcatgg
600

ggtggtggag gaagtttccc aaccattccc ttatccaggc tttttgacaa cgctatgctc
660

cgcgcccatc gtctgcacca gctggccttt gacacctacc aggagtttga agaagcctat
720

atcccaaagg aacagaagta ttcattcctg cagaaccccc agacctccct ctgtttctca
780

gagtctattc cgacaccctc caacagggag gaaacacaac agaaatccaa cctagagctg
840

ctccgcatct ccctgctgct catccagtcg tggctggagc ccgtgcagtt cctcaggagt
900

gtcttcgcca acagcctggt gtacggcgcc tctgacagca acgtctatga cctcctaaag
960

gacctagagg aaggcatcca aacgctgatg gggaggctgg aagatggcag cccccggact
1020

gggcagatct tcaagcagac ctacagcaag ttcgacacaa actcacacaa cgatgacgca
1080

ctactcaaga actacgggct gctctactgc ttcaggaagg acatggacaa ggtcgagaca
1140

ttcctgcgca tcgtgcagtg ccgctctgtg gagggcagct gtggcttctc atgaggatcc
1200



<210> 49 

<211> 396 

<212> PRT 

<213> Artificial 



<220>
<223> synthetic sequence

<220>
<221> MISC_FEATURE
<222> (198)..(395)
<223> sequence is repeated N-1 times, where N is a positive whole number



<400> 49 

Trp Gly Gly Gly Gly Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe
1 5 10 15

Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp
20 25 30

Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr
35 40 45

Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile
50 55 60

Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu
65 70 75 80

Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val
85 90 95

Gln Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser
100 105 110

Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln
115 120 125

Thr Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile
130 135 140

Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp
145 150 155 160

Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met
165 170 175

Asp Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu
180 185 190

Gly Ser Cys Gly Phe Ser Trp Gly Gly Gly Gly Ser Phe Pro Thr Ile
195 200 205

Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu
210 215 220

His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile
225 230 235 240

Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu
245 250 255

Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln
260 265 270

Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln
275 280 285

Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val Phe Ala Asn Ser
290 295 300

Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp
305 310 315 320

Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu Glu Asp Gly Ser
325 330 335

Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr
340 345 350

Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr
355 360 365

Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe Leu Arg Ile Val
370 375 380

Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe Ser
385 390 395

<210> 50 

<211> 1185 

<212> DNA 

<213> Artificial 



<220> 

<223> synthetic sequence 
<400> 50 

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttct catggggtgg tggaggaagt
600

ttcccaacca ttcccttatc caggcttttt gacaacgcta tgctccgcgc ccatcgtctg
660

caccagctgg cctttgacac ctaccaggag tttgaagaag cctatatccc aaaggaacag
720

aagtattcat tcctgcagaa cccccagacc tccctctgtt tctcagagtc tattccgaca
780

ccctccaaca gggaggaaac acaacagaaa tccaacctag agctgctccg catctccctg
840

ctgctcatcc agtcgtggct ggagcccgtg cagttcctca ggagtgtctt cgccaacagc
900

ctggtgtacg gcgcctctga cagcaacgtc tatgacctcc taaaggacct agaggaaggc
960

atccaaacgc tgatggggag gctggaagat ggcagccccc ggactgggca gatcttcaag
1020

cagacctaca gcaagttcga cacaaactca cacaacgatg acgcactact caagaactac
1080

gggctgctct actgcttcag gaaggacatg gacaaggtcg agacattcct gcgcatcgtg
1140

cagtgccgct ctgtggaggg cagctgtggc ttctcatgag gatcc
1185



<210> 51 

<211> 391 

<212> PRT 

<213> Artificial 

<220> 

<223> synthetic sequence 
<220> 

<221> mat_peptide

<222> (1)..()

<400> 51

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Ser Trp Gly Gly Gly Gly Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu
195 200 205

Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe
210 215 220

Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys
225 230 235 240

Tyr Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser
245 250 255

Ile Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu
260 265 270

Glu Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro
275 280 285

Val Gln Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala
290 295 300

Ser Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile
305 310 315 320

Gln Thr Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln
325 330 335

Ile Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp
340 345 350

Asp Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp
355 360 365

Met Asp Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val
370 375 380

Glu Gly Ser Cys Gly Phe Ser
385 390

<210> 52 

<211> 1779 

<212> DNA 

<213> Artificial 



<220> 

<223> synthetic sequence

<220>
<221> misc_feature
<222> (1174)..(1767)
<223> sequence is repeated N-1 times, where N is a positive whole number

<400> 52

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttct catggggtgg tggaggaagt
600

ttcccaacca ttcccttatc caggcttttt gacaacgcta tgctccgcgc ccatcgtctg
660

caccagctgg cctttgacac ctaccaggag tttgaagaag cctatatccc aaaggaacag
720

aagtattcat tcctgcagaa cccccagacc tccctctgtt tctcagagtc tattccgaca
780

ccctccaaca gggaggaaac acaacagaaa tccaacctag agctgctccg catctccctg
840

ctgctcatcc agtcgtggct ggagcccgtg cagttcctca ggagtgtctt cgccaacagc
900

ctggtgtacg gcgcctctga cagcaacgtc tatgacctcc taaaggacct agaggaaggc
960

atccaaacgc tgatggggag gctggaagat ggcagccccc ggactgggca gatcttcaag
1020

cagacctaca gcaagttcga cacaaactca cacaacgatg acgcactact caagaactac
1080

gggctgctct actgcttcag gaaggacatg gacaaggtcg agacattcct gcgcatcgtg
1140

cagtgccgct ctgtggaggg cagctgtggc ttctcatggg gtggtggagg aagtttccca
1200

accattccct tatccaggct ttttgacaac gctatgctcc gcgcccatcg tctgcaccag
1260

ctggcctttg acacctacca ggagtttgaa gaagcctata tcccaaagga acagaagtat
1320

tcattcctgc agaaccccca gacctccctc tgtttctcag agtctattcc gacaccctcc
1380

aacagggagg aaacacaaca gaaatccaac ctagagctgc tccgcatctc cctgctgctc
1440

atccagtcgt ggctggagcc cgtgcagttc ctcaggagtg tcttcgccaa cagcctggtg
1500

tacggcgcct ctgacagcaa cgtctatgac ctcctaaagg acctagagga aggcatccaa
1560

acgctgatgg ggaggctgga agatggcagc ccccggactg ggcagatctt caagcagacc
1620

tacagcaagt tcgacacaaa ctcacacaac gatgacgcac tactcaagaa ctacgggctg
1680

ctctactgct tcaggaagga catggacaag gtcgagacat tcctgcgcat cgtgcagtgc
1740

cgctctgtgg agggcagctg tggcttctca tgaggatcc
1779



<210> 53 

<211> 589
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> MISC_FEATURE
<222> (391)..(588)
<223> sequence is repeated N-1 times, where N is a positive whole number

<220>
<221> mat_peptide
<222> (1)..()

<400> 53

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Ser Trp Gly Gly Gly Gly Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu
195 200 205

Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe
210 215 220

Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys
225 230 235 240

Tyr Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser
245 250 255

Ile Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu
260 265 270

Glu Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro
275 280 285

Val Gln Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala
290 295 300

Ser Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile
305 310 315 320

Gln Thr Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln
325 330 335

Ile Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp
340 345 350

Asp Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp
355 360 365

Met Asp Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val
370 375 380

Glu Gly Ser Cys Gly Phe Ser Trp Gly Gly Gly Gly Ser Phe Pro Thr
385 390 395 400

Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala His Arg
405 410 415

Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr
420 425 430

Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro Gln Thr Ser
435 440 445

Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg Glu Glu Thr
450 455 460

Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu Leu Leu Ile
465 470 475 480

Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val Phe Ala Asn
485 490 495

Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp Leu Leu Lys
500 505 510

Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu Glu Asp Gly
515 520 525

Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser Lys Phe Asp
530 535 540

Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr Gly Leu Leu
545 550 555 560

Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe Leu Arg Ile
565 570 575

Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe Ser
580 585



<210> 54 

<211> 2370 

<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> misc_feature
<222> (1174)..(1767)
<223> sequence is repeated N-1 times, where N is a positive whole number

<400> 54

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttct catggggtgg tggaggaagt
600

ttcccaacca ttcccttatc caggcttttt gacaacgcta tgctccgcgc ccatcgtctg
660

caccagctgg cctttgacac ctaccaggag tttgaagaag cctatatccc aaaggaacag
720

aagtattcat tcctgcagaa cccccagacc tccctctgtt tctcagagtc tattccgaca
780

ccctccaaca gggaggaaac acaacagaaa tccaacctag agctgctccg catctccctg
840

ctgctcatcc agtcgtggct ggagcccgtg cagttcctca ggagtgtctt cgccaacagc
900

ctggtgtacg gcgcctctga cagcaacgtc tatgacctcc taaaggacct agaggaaggc
960

atccaaacgc tgatggggag gctggaagat ggcagccccc ggactgggca gatcttcaag
1020

cagacctaca gcaagttcga cacaaactca cacaacgatg acgcactact caagaactac
1080

gggctgctct actgcttcag gaaggacatg gacaaggtcg agacattcct gcgcatcgtg
1140

cagtgccgct ctgtggaggg cagctgtggc ttctcatggg gtggtggagg aagtttccca
1200

accattccct tatccaggct ttttgacaac gctatgctcc gcgcccatcg tctgcaccag
1260

ctggcctttg acacctacca ggagtttgaa gaagcctata tcccaaagga acagaagtat
1320

tcattcctgc agaaccccca gacctccctc tgtttctcag agtctattcc gacaccctcc
1380

aacagggagg aaacacaaca gaaatccaac ctagagctgc tccgcatctc cctgctgctc
1440

atccagtcgt ggctggagcc cgtgcagttc ctcaggagtg tcttcgccaa cagcctggtg
1500

tacggcgcct ctgacagcaa cgtctatgac ctcctaaagg acctagagga aggcatccaa
1560

acgctgatgg ggaggctgga agatggcagc ccccggactg ggcagatctt caagcagacc
1620

tacagcaagt tcgacacaaa ctcacacaac gatgacgcac tactcaagaa ctacgggctg
1680

ctctactgct tcaggaagga catggacaag gtcgagacat tcctgcgcat cgtgcagtgc
1740

cgctctgtgg agggcagctg tggcttctca tggggtggtg gaggaagttt cccaaccatt
1800

cccttatcca ggctttttga caacgctatg ctccgcgccc atcgtctgca ccagctggcc
1860

tttgacacct accaggagtt tgaagaagcc tatatcccaa aggaacagaa gtattcattc
1920

ctgcagaacc cccagacctc cctctgtttc tcagagtcta ttccgacacc ctccaacagg
1980

gaggaaacac aacagaaatc caacctagag ctgctccgca tctccctgct gctcatccag
2040

tcgtggctgg agcccgtgca gttcctcagg agtgtcttcg ccaacagcct ggtgtacggc
2100

gcctctgaca gcaacgtcta tgacctccta aaggacctag aggaaggcat ccaaacgctg
2160

atggggaggc tggaagatgg cagcccccgg actgggcaga tcttcaagca gacctacagc
2220

aagttcgaca caaactcaca caacgatgac gcactactca agaactacgg gctgctctac
2280

tgcttcagga aggacatgga caaggtcgag acattcctgc gcatcgtgca gtgccgctct
2340

gtggagggca gctgtggctt ctagggatcc
2370

<210> 55

<211> 786 

<212> PRT 

<213> Artificial 



<220> 

<223> synthetic sequence 
<220> 

<221> MISC_FEATURE 

<222> (391)..(588)

<220> 

<221> mat_peptide 
<222> (1)..() 

<400> 55 

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Ser Trp Gly Gly Gly Gly Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu
195 200 205

Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe
210 215 220

Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys
225 230 235 240

Tyr Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser
245 250 255

Ile Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu
260 265 270

Glu Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro
275 280 285

Val Gln Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala
290 295 300

Ser Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile
305 310 315 320

Gln Thr Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln
325 330 335

Ile Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp
340 345 350

Asp Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp
355 360 365

Met Asp Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val
370 375 380

Glu Gly Ser Cys Gly Phe Ser Trp Gly Gly Gly Gly Ser Phe Pro Thr
385 390 395 400

Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala His Arg
405 410 415

Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr
420 425 430

Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro Gln Thr Ser
435 440 445

Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg Glu Glu Thr
450 455 460

Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu Leu Leu Ile
465 470 475 480

Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val Phe Ala Asn
485 490 495

Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp Leu Leu Lys
500 505 510

Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu Glu Asp Gly
515 520 525

Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser Lys Phe Asp
530 535 540

Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr Gly Leu Leu
545 550 555 560

Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe Leu Arg Ile
565 570 575

Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe Ser Trp Gly Gly
580 585 590

Gly Gly Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala
595 600 605

Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln
610 615 620

Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu
625 630 635 640

Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro
645 650 655

Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg
660 665 670

Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu
675 680 685

Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn
690 695 700

Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met
705 710 715 720

Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln
725 730 735

Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu
740 745 750

Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val
755 760 765

Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys
770 775 780

Gly Phe
785

<210> 56
<211> 33
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<400> 56
ttaccatgga ttgccggcgg cggcggatcc aat
33



<210> 57 

<211> 36 

<212> DNA 

<213> Artificial

<220>
<223> synthetic sequence

<400> 57

ttaccatgga tttgatcagg cggcggcgga tccaat
36



<210> 58 
<211> 36
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<400> 58

tgatcaggcg gcggcggatc aggcggcggc ggatcc
36

<210> 59 

<211> 10 

<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<400> 59

Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 
1 5 10 

<210> 60
<211> 48
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<400> 60

gccggcggcg gcggatcagg cggcggcgga tcaggcggcg gcggatcc
48



<210> 61
<211> 14
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence
<400> 61

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 
1 5 10 

<210> 62 

<211> 43 

<212> DNA 

<213> Artificial 

<220> 

<223> synthetic sequence 
<400> 62 

ggacatatgc tgtgatcatt cccaaccatt cccttatcca ggc
43 

<210> 63
<211> 41
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence
<400> 63

cgcgaattcg atccatggaa gccacagctg ccctccacag a
41

<210> 64
<211> 36
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence
<400> 64

cgcgtcgacc tagaagccac agctgccctc cacaga 
36 

<210> 65
<211> 602
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence
<400> 65

catatgctgt gatcattccc aaccattccc ttatccaggc tttttgacaa cgctatgctc
60

cgcgcccatc gtctgcacca gctggccttt gacacctacc aggagtttga agaagcctat
120

atcccaaagg aacagaagta ttcattcctg cagaaccccc agacctccct ctgtttctca
180

gagtctattc cgacaccctc caacagggag gaaacacaac agaaatccaa cctagagctg
240

ctccgcatct ccctgctgct catccagtcg tggctggagc ccgtgcagtt cctcaggagt
300

gtcttcgcca acagcctggt gtacggcgcc tctgacagca acgtctatga cctcctaaag
360

gacctagagg aaggcatcca aacgctgatg gggaggctgg aagatggcag cccccggact
420

gggcagatct tcaagcagac ctacagcaag ttcgacacaa actcacacaa cgatgacgca
480

ctactcaaga actacgggct gctctactgc ttcaggaagg acatggacaa ggtcgagaca
540

ttcctgcgca tcgtgcagtg ccgctctgtg gagggcagct gtggcttcca tggatcgaat
600

tc
602



<210> 66 

<211> 192 

<212> PRT 

<213> Artificial 

<220> 

<223> synthetic sequence 
<400> 66 

Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

<210> 67 
<211> 600 
<212> DNA 
<213> Artificial 



<220> 

<223> synthetic sequence



<400> 67 

catatgctgt gatcattccc aaccattccc ttatccaggc tttttgacaa cgctatgctc
60

cgcgcccatc gtctgcacca gctggccttt gacacctacc aggagtttga agaagcctat
120

atcccaaagg aacagaagta ttcattcctg cagaaccccc agacctccct ctgtttctca
180

gagtctattc cgacaccctc caacagggag gaaacacaac agaaatccaa cctagagctg
240

ctccgcatct ccctgctgct catccagtcg tggctggagc ccgtgcagtt cctcaggagt
300

gtcttcgcca acagcctggt gtacggcgcc tctgacagca acgtctatga cctcctaaag
360

gacctagagg aaggcatcca aacgctgatg gggaggctgg aagatggcag cccccggact
420

gggcagatct tcaagcagac ctacagcaag ttcgacacaa actcacacaa cgatgacgca
480

ctactcaaga actacgggct gctctactgc ttcaggaagg acatggacaa ggtcgagaca
540

ttcctgcgca tcgtgcagtg ccgctctgtg gagggcagct gtggcttcta ggtcgacgcg
600



<210> 68 

<211> 192 

<212> PRT 

<213> Artificial

<220>
<223> synthetic sequence
<400> 68

Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

<210> 69 
<211> 639 
<212> DNA 
<213> Artificial 



<220> 

<223> synthetic sequence 






<400> 69 

catatgctgt gatcattccc aaccattccc ttatccaggc tttttgacaa cgctatgctc
60

cgcgcccatc gtctgcacca gctggccttt gacacctacc aggagtttga agaagcctat
120

atcccaaagg aacagaagta ttcattcctg cagaaccccc agacctccct ctgtttctca
180

gagtctattc cgacaccctc caacagggag gaaacacaac agaaatccaa cctagagctg
240

ctccgcatct ccctgctgct catccagtcg tggctggagc ccgtgcagtt cctcaggagt
300

gtcttcgcca acagcctggt gtacggcgcc tctgacagca acgtctatga cctcctaaag
360

gacctagagg aaggcatcca aacgctgatg gggaggctgg aagatggcag cccccggact
420

gggcagatct tcaagcagac ctacagcaag ttcgacacaa actcacacaa cgatgacgca
480

ctactcaaga actacgggct gctctactgc ttcaggaagg acatggacaa ggtcgagaca
540

ttcctgcgca tcgtgcagtg ccgctctgtg gagggcagct gtggcttcgg cggcggcgga
600

tcaggcggcg gcggatcagg cggcggcgga tccgaattc
639



<210> 70 

<211> 206 

<212> PRT 

<213> Artificial 

<220> 

<223> synthetic sequence 
<400> 70 

Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
195 200 205

<210> 71 

<211> 630 

<212> DNA 

<213> Artificial 



<220> 



<223> synthetic sequence 
<400> 71 

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttcg gcggcggcgg atcaggcggc
600

ggcggatcag gcggcggcgg atccgaattc
630



<210> 72
<211> 206
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<400> 72

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
195 200 205

<210> 73
<211> 1248
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> misc_feature
<222> (619)..(1236)
<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 73

tgatcattcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttcg gcggcggcgg atcaggcggc
600

ggcggatcag gcggcggcgg atcattccca accattccct tatccaggct ttttgacaac
660

gctatgctcc gcgcccatcg tctgcaccag ctggcctttg acacctacca ggagtttgaa
720

gaagcctata tcccaaagga acagaagtat tcattcctgc agaaccccca gacctccctc
780

tgtttctcag agtctattcc gacaccctcc aacagggagg aaacacaaca gaaatccaac
840

ctagagctgc tccgcatctc cctgctgctc atccagtcgt ggctggagcc cgtgcagttc
900

ctcaggagtg tcttcgccaa cagcctggtg tacggcgcct ctgacagcaa cgtctatgac
960

ctcctaaagg acctagagga aggcatccaa acgctgatgg ggaggctgga agatggcagc
1020

ccccggactg ggcagatctt caagcagacc tacagcaagt tcgacacaaa ctcacacaac
1080

gatgacgcac tactcaagaa ctacgggctg ctctactgct tcaggaagga catggacaag
1140

gtcgagacat tcctgcgcat cgtgcagtgc cgctctgtgg agggcagctg tggcttcggc
1200

ggcggcggat caggcggcgg cggatcaggc ggcggcggat ccgaattc
1248



<210> 74
<211> 412
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> MISC_FEATURE
<222> (193)..(398)
<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 74

Ser Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Phe
195 200 205

Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala
210 215 220

His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu
225 230 235 240

Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro Gln
245 250 255

Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg Glu
260 265 270

Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu Leu
275 280 285

Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val Phe
290 295 300

Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp Leu
305 310 315 320

Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu Glu
325 330 335

Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser Lys
340 345 350

Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr Gly
355 360 365

Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe Leu
370 375 380

Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe Gly Gly
385 390 395 400

Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
405 410

<210> 75
<211> 2445
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> misc_feature
<222> (1237)..(1854)
<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 75

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttcg gcggcggcgg atcaggcggc
600

ggcggatcag gcggcggcgg atcattccca accattccct tatccaggct ttttgacaac
660

gctatgctcc gcgcccatcg tctgcaccag ctggcctttg acacctacca ggagtttgaa
720

gaagcctata tcccaaagga acagaagtat tcattcctgc agaaccccca gacctccctc
780

tgtttctcag agtctattcc gacaccctcc aacagggagg aaacacaaca gaaatccaac
840

ctagagctgc tccgcatctc cctgctgctc atccagtcgt ggctggagcc cgtgcagttc
900

ctcaggagtg tcttcgccaa cagcctggtg tacggcgcct ctgacagcaa cgtctatgac
960

ctcctaaagg acctagagga aggcatccaa acgctgatgg ggaggctgga agatggcagc
1020

ccccggactg ggcagatctt caagcagacc tacagcaagt tcgacacaaa ctcacacaac
1080

gatgacgcac tactcaagaa ctacgggctg ctctactgct tcaggaagga catggacaag
1140

gtcgagacat tcctgcgcat cgtgcagtgc cgctctgtgg agggcagctg tggcttcggc
1200

ggcggcggat caggcggcgg cggatcaggc ggcggcggat cattcccaac cattccctta
1260

tccaggcttt ttgacaacgc tatgctccgc gcccatcgtc tgcaccagct ggcctttgac
1320

acctaccagg agtttgaaga agcctatatc ccaaaggaac agaagtattc attcctgcag
1380

aacccccaga cctccctctg tttctcagag tctattccga caccctccaa cagggaggaa
1440

acacaacaga aatccaacct agagctgctc cgcatctccc tgctgctcat ccagtcgtgg
1500

ctggagcccg tgcagttcct caggagtgtc ttcgccaaca gcctggtgta cggcgcctct
1560

gacagcaacg tctatgacct cctaaaggac ctagaggaag gcatccaaac gctgatgggg
1620

aggctggaag atggcagccc ccggactggg cagatcttca agcagaccta cagcaagttc
1680

gacacaaact cacacaacga tgacgcacta ctcaagaact acgggctgct ctactgcttc
1740

aggaaggaca tggacaaggt cgagacattc ctgcgcatcg tgcagtgccg ctctgtggag
1800

ggcagctgtg gcttcggcgg cggcggatca ggcggcggcg gatcaggcgg cggcggatca
1860

ttcccaacca ttcccttatc caggcttttt gacaacgcta tgctccgcgc ccatcgtctg
1920

caccagctgg cctttgacac ctaccaggag tttgaagaag cctatatccc aaaggaacag
1980

aagtattcat tcctgcagaa cccccagacc tccctctgtt tctcagagtc tattccgaca
2040

ccctccaaca gggaggaaac acaacagaaa tccaacctag agctgctccg catctccctg
2100

ctgctcatcc agtcgtggct ggagcccgtg cagttcctca ggagtgtctt cgccaacagc
2160

ctggtgtacg gcgcctctga cagcaacgtc tatgacctcc taaaggacct agaggaaggc
2220

atccaaacgc tgatggggag gctggaagat ggcagccccc ggactgggca gatcttcaag
2280

cagacctaca gcaagttcga cacaaactca cacaacgatg acgcactact caagaactac
2340

gggctgctct actgcttcag gaaggacatg gacaaggtcg agacattcct gcgcatcgtg
2400

cagtgccgct ctgtggaggg cagctgtggc ttctaggtcg acgcg
2445



<210> 76
<211> 810
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> MISC_FEATURE
<222> (412)..(617)
<223> sequence is repeated N-1 times, where N is a positive whole
number

<220>
<221> mat_peptide
<222> (1)..()

<400> 76

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Phe
195 200 205

Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala
210 215 220

His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu
225 230 235 240

Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro Gln
245 250 255

Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg Glu
260 265 270

Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu Leu
275 280 285

Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val Phe
290 295 300

Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp Leu
305 310 315 320

Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu Glu
325 330 335

Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser Lys
340 345 350

Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr Gly
355 360 365

Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe Leu
370 375 380

Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe Gly Gly
385 390 395 400

Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Phe Pro Thr
405 410 415

Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala His Arg
420 425 430

Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr
435 440 445

Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro Gln Thr Ser
450 455 460

Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg Glu Glu Thr
465 470 475 480

Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu Leu Leu Ile
485 490 495

Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val Phe Ala Asn
500 505 510

Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp Leu Leu Lys
515 520 525

Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu Glu Asp Gly
530 535 540

Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser Lys Phe Asp
545 550 555 560

Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr Gly Leu Leu
565 570 575

Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe Leu Arg Ile
580 585 590

Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe Gly Gly Gly Gly
595 600 605

Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Phe Pro Thr Ile Pro
610 615 620

Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu His
625 630 635 640

Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro
645 650 655

Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys
660 665 670

Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln
675 680 685

Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser
690 695 700

Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val Phe Ala Asn Ser Leu
705 710 715 720

Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu
725 730 735

Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu Glu Asp Gly Ser Pro
740 745 750

Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn
755 760 765

Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys
770 775 780

Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe Leu Arg Ile Val Gln
785 790 795 800

Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
805 810

<210> 77
<211> 593
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<400> 77

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttcc atggatcgaa ttc
593



<210> 78
<211> 192
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<400> 78

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

<210> 79
<211> 592
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<400> 79

aagctttccc aaccattccc ttatccaggc tttttgacaa cgctatgctc cgcgcccatc
60

gtctgcacca gctggccttt gacacctacc aggagtttga agaagcctat atcccaaagg
120

aacagaagta ttcattcctg cagaaccccc agacctccct ctgtttctca gagtctattc
180

cgacaccctc caacagggag gaaacacaac agaaatccaa cctagagctg ctccgcatct
240

ccctgctgct catccagtcg tggctggagc ccgtgcagtt cctcaggagt gtcttcgcca
300

acagcctggt gtacggcgcc tctgacagca acgtctatga cctcctaaag gacctagagg
360

aaggcatcca aacgctgatg gggaggctgg aagatggcag cccccggact gggcagatct
420

tcaagcagac ctacagcaag ttcgacacaa actcacacaa cgatgacgca ctactcaaga
480

actacgggct gctctactgc ttcaggaagg acatggacaa ggtcgagaca ttcctgcgca
540

tcgtgcagtg ccgctctgtg gagggcagct gtggcttcca tggatcgaat tc
592



<210> 80 

<211> 191 

<212> PRT 

<213> Artificial 

<220> 

<223> synthetic sequence 
<400> 80 

Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg
1 5 10 15

Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu
20 25 30

Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro
35 40 45

Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg
50 55 60

Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu
65 70 75 80

Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val
85 90 95

Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp
100 105 110

Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu
115 120 125

Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser
130 135 140

Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr
145 150 155 160

Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe
165 170 175

Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190



<210> 81 

<211> 587 

<212> DNA 

<213> Artificial 

<220> 

<223> synthetic sequence 



<400> 81 

aagctttccc aaccattccc ttatccaggc tttttgacaa cgctatgctc cgcgcccatc
60

gtctgcacca gctggccttt gacacctacc aggagtttga agaagcctat atcccaaagg
120

aacagaagta ttcattcctg cagaaccccc agacctccct ctgtttctca gagtctattc
180

cgacaccctc caacagggag gaaacacaac agaaatccaa cctagagctg ctccgcatct
240

ccctgctgct catccagtcg tggctggagc ccgtgcagtt cctcaggagt gtcttcgcca
300

acagcctggt gtacggcgcc tctgacagca acgtctatga cctcctaaag gacctagagg
360

aaggcatcca aacgctgatg gggaggctgg aagatggcag cccccggact gggcagatct
420

tcaagcagac ctacagcaag ttcgacacaa actcacacaa cgatgacgca ctactcaaga
480

actacgggct gctctactgc ttcaggaagg acatggacaa ggtcgagaca ttcctgcgca
540

tcgtgcagtg ccgctctgtg gagggcagct gtggcttcta gggatcc
587

10 

<210> 82
<211> 191
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<400> 82

Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg
1 5 10 15

Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu
20 25 30

Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro
35 40 45

Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg
50 55 60

Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu
65 70 75 80

Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val
85 90 95

Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp
100 105 110

Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu
115 120 125

Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser
130 135 140

Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr
145 150 155 160

Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe
165 170 175

Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

5 

<210> 83
<211> 1165
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> misc_feature
<222> (579)..(1151)
<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 83

aagctttccc aaccattccc ttatccaggc tttttgacaa cgctatgctc cgcgcccatc
60

gtctgcacca gctggccttt gacacctacc aggagtttga agaagcctat atcccaaagg
120

aacagaagta ttcattcctg cagaaccccc agacctccct ctgtttctca gagtctattc
180

cgacaccctc caacagggag gaaacacaac agaaatccaa cctagagctg ctccgcatct
240

ccctgctgct catccagtcg tggctggagc ccgtgcagtt cctcaggagt gtcttcgcca
300

acagcctggt gtacggcgcc tctgacagca acgtctatga cctcctaaag gacctagagg
360

aaggcatcca aacgctgatg gggaggctgg aagatggcag cccccggact gggcagatct
420

tcaagcagac ctacagcaag ttcgacacaa actcacacaa cgatgacgca ctactcaaga
480

actacgggct gctctactgc ttcaggaagg acatggacaa ggtcgagaca ttcctgcgca
540

tcgtgcagtg ccgctctgtg gagggcagct gtggcttctt cccaaccatt cccttatcca
600

ggctttttga caacgctatg ctccgcgccc atcgtctgca ccagctggcc tttgacacct
660

accaggagtt tgaagaagcc tatatcccaa aggaacagaa gtattcattc ctgcagaacc
720

cccagacctc cctctgtttc tcagagtcta ttccgacacc ctccaacagg gaggaaacac
780

aacagaaatc caacctagag ctgctccgca tctccctgct gctcatccag tcgtggctgg
840

agcccgtgca gttcctcagg agtgtcttcg ccaacagcct ggtgtacggc gcctctgaca
900

gcaacgtcta tgacctccta aaggacctag aggaaggcat ccaaacgctg atggggaggc
960

tggaagatgg cagcccccgg actgggcaga tcttcaagca gacctacagc aagttcgaca
1020

caaactcaca caacgatgac gcactactca agaactacgg gctgctctac tgcttcagga
1080

aggacatgga caaggtcgag acattcctgc gcatcgtgca gtgccgctct gtggagggca
1140

gctgtggctt ccatggatcg aattc
1165



<210> 84
<211> 191
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> MISC_FEATURE
<222> (1)..(191)
<223> sequence is repeated N times, where N is a positive whole
number

<400> 84

Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu Arg
1 5 10 15

Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu
20 25 30

Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro
35 40 45

Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn Arg
50 55 60

Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu
65 70 75 80

Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser Val
85 90 95

Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp
100 105 110

Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg Leu
115 120 125

Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser
130 135 140

Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn Tyr
145 150 155 160

Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe
165 170 175

Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190

<210> 85
<211> 2307
<212> DNA
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> misc_feature
<222> (1153)..(1725)
<223> sequence is repeated N-1 times, where N is a positive whole
number

<400> 85

catatgttcc caaccattcc cttatccagg ctttttgaca acgctatgct ccgcgcccat
60

cgtctgcacc agctggcctt tgacacctac caggagtttg aagaagccta tatcccaaag
120

gaacagaagt attcattcct gcagaacccc cagacctccc tctgtttctc agagtctatt
180

ccgacaccct ccaacaggga ggaaacacaa cagaaatcca acctagagct gctccgcatc
240

tccctgctgc tcatccagtc gtggctggag cccgtgcagt tcctcaggag tgtcttcgcc
300

aacagcctgg tgtacggcgc ctctgacagc aacgtctatg acctcctaaa ggacctagag
360

gaaggcatcc aaacgctgat ggggaggctg gaagatggca gcccccggac tgggcagatc
420

ttcaagcaga cctacagcaa gttcgacaca aactcacaca acgatgacgc actactcaag
480

aactacgggc tgctctactg cttcaggaag gacatggaca aggtcgagac attcctgcgc
540

atcgtgcagt gccgctctgt ggagggcagc tgtggcttct tcccaaccat tcccttatcc
600

aggctttttg acaacgctat gctccgcgcc catcgtctgc accagctggc ctttgacacc
660

taccaggagt ttgaagaagc ctatatccca aaggaacaga agtattcatt cctgcagaac
720

ccccagacct ccctctgttt ctcagagtct attccgacac cctccaacag ggaggaaaca
780

caacagaaat ccaacctaga gctgctccgc atctccctgc tgctcatcca gtcgtggctg
840

gagcccgtgc agttcctcag gagtgtcttc gccaacagcc tggtgtacgg cgcctctgac
900

agcaacgtct atgacctcct aaaggaccta gaggaaggca tccaaacgct gatggggagg
960

ctggaagatg gcagcccccg gactgggcag atcttcaagc agacctacag caagttcgac
1020

acaaactcac acaacgatga cgcactactc aagaactacg ggctgctcta ctgcttcagg
1080

aaggacatgg acaaggtcga gacattcctg cgcatcgtgc agtgccgctc tgtggagggc
1140

agctgtggct tcttcccaac cattccctta tccaggcttt ttgacaacgc tatgctccgc
1200

gcccatcgtc tgcaccagct ggcctttgac acctaccagg agtttgaaga agcctatatc
1260

ccaaaggaac agaagtattc attcctgcag aacccccaga cctccctctg tttctcagag
1320

tctattccga caccctccaa cagggaggaa acacaacaga aatccaacct agagctgctc
1380

cgcatctccc tgctgctcat ccagtcgtgg ctggagcccg tgcagttcct caggagtgtc
1440

ttcgccaaca gcctggtgta cggcgcctct gacagcaacg tctatgacct cctaaaggac
1500

ctagaggaag gcatccaaac gctgatgggg aggctggaag atggcagccc ccggactggg
1560

cagatcttca agcagaccta cagcaagttc gacacaaact cacacaacga tgacgcacta
1620

ctcaagaact acgggctgct ctactgcttc aggaaggaca tggacaaggt cgagacattc
1680

ctgcgcatcg tgcagtgccg ctctgtggag ggcagctgtg gcttcttccc aaccattccc
1740

ttatccaggc tttttgacaa cgctatgctc cgcgcccatc gtctgcacca gctggccttt
1800

gacacctacc aggagtttga agaagcctat atcccaaagg aacagaagta ttcattcctg
1860

cagaaccccc agacctccct ctgtttctca gagtctattc cgacaccctc caacagggag
1920

gaaacacaac agaaatccaa cctagagctg ctccgcatct ccctgctgct catccagtcg
1980

tggctggagc ccgtgcagtt cctcaggagt gtcttcgcca acagcctggt gtacggcgcc
2040

tctgacagca acgtctatga cctcctaaag gacctagagg aaggcatcca aacgctgatg
2100

gggaggctgg aagatggcag cccccggact gggcagatct tcaagcagac ctacagcaag
2160

ttcgacacaa actcacacaa cgatgacgca ctactcaaga actacgggct gctctactgc
2220

ttcaggaagg acatggacaa ggtcgagaca ttcctgcgca tcgtgcagtg ccgctctgtg
2280

gagggcagct gtggcttcta gggatcc
2307



<210> 86
<211> 192
<212> PRT
<213> Artificial

<220>
<223> synthetic sequence

<220>
<221> MISC_FEATURE
<222> (2)..(192)
<223> sequence is repeated N+2 times, where N is a positive whole
number

<220>
<221> mat_peptide
<222> (1)..()

<400> 86

Met Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp Asn Ala Met Leu
1 5 10 15

Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr Gln Glu Phe
20 25 30

Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn
35 40 45

Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser Asn
50 55 60

Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu Leu Arg Ile Ser
65 70 75 80

Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln Phe Leu Arg Ser
85 90 95

Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr
100 105 110

Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu Met Gly Arg
115 120 125

Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr
130 135 140

Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys Asn
145 150 155 160

Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr
165 170 175

Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly Ser Cys Gly Phe
180 185 190


